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Abstract. In this paper we explore how concepts of high-dimonsional data compression via 
random projections onto lower-dimensional spaces can be applied for tractable simulation of cer- 
tain dynamical systems modeling complex interactions. In such systems, one has to deal with a 
large number of agents (typically millions) in spaces of parameters describing each agent of high 
dimension (thousands or more). Even with today's powerful computers, numerical simulations of 
such systems are prohibitively expensive. We propose an approach for the simulation of dynami- 
cal systems governed by functions of adjacency matrices in high dimension, by random projections 
via Johnson-Lindenstrauss embeddings, and recovery by compressed sensing techniques. We show 
how these concepts can be generalized to work for associated kinetic equations, by addressing the 
phenomenon of the delayed curse of dimension, known in information-based complexity for optimal 
numerical integration problems and measure quantization in high dimensions. 
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1. Introduction. The dimensionality scale of problems arising in our modern 
information society has become very large and finding appropriate methods for dealing 
with them is one of the great challenges of today's numerical simulation. The most 
notable recent advances in data analysis are based on the observation that in many 
situations, even for very complex phenomena, the intrinsic dimensionality of the data 
is significantly lower than the ambient dimension. Remarkable progresses have been 
made in data compression, processing, and acquisition. We mention, for instance, 
the use of diffusion maps for data clouds and graphs in high dimension [5j [6l [191 HOl 
dU 133] in order to define low-dimensional local representations of data with small 
distance distortion, and meaningful automatic clustering properties. In this setting 
the embedding of data is performed by a highly nonlinear procedure, obtained by 
computing the eigenfunctions of suitable normalized diffusion kernels, measuring the 
probability of transition from one data point to another over the graph. 

Quasi-isometrical linear embeddings of high-dimensional point clouds into low- 
dimensional spaces of parameters are provided by the well-known Johnson-Lindenstrauss 
Lemma [I] [521 SO] '■ any cloud of N points in can be embedded by a random lin- 
ear projection M nearly isometrically into M'^ with k = C(e~^ log(A^)) (a precise 
statement will be given below). This embedding strategy is simpler than the use of 
diffusion maps, as it is linear, however it is "blind" to the specific geometry and local 
dimensionality of the data, as the embedding dimension k depends exclusively on the 
number of points in the cloud. In many applications, this is sufficient, as the number 
of points M is supposed to be a power of the dimension d, and the embedding produces 
an effective reduction to fc = 0{e~^ \og(N)) = 0{e~^ log(d)) dimensions. As clarified 
in [21 [31], the Johnson-Lindenstrauss Lemma is also at the basis of the possibility 
of performing optimal compressed and nonadaptive acquisition of high-dimensional 
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data. In compressed sensing [H [13 a vector a; G M'* is encoded in a vector y e M*^ by 
applying a random projection M, which is modehng a hnear acquisition device with 
random sensors, i.e., y = Mx. From y it is possible to decode x approximately (see 
Theorem 13 . 71 below) by solving the convex optimization problem 



x^ = arg mm 

Mz=y 




with the error distortion 



where aK{x)id ~ 'v[^iz:#sui:>p(z)<K \\z ~ ^Wif and K ~ 0{k/{\og{d/k) + 1)). We de- 
note = {z g R'' : #supp (z) < K} the set of A'-sparse vectors, i.e., the union of 
iC-dimensional coordinate subspaces in W^. In particular, if a; € E/^ , then x"^ = x. 
Hence, not only is M a Johnson-Lindenstrauss embedding, quasi-isometrical on point 
clouds and i^-dimensional coordinate subspaces, but also allows for the recovery of 
the most relevant components of high-dimensional vectors, from low-dimensional en- 
coded information. A recent work [H [59] extends the quasi-isometrical properties of 
the Johnson-Lindenstrauss embedding from point clouds and .ftT-dimcnsional coordi- 
nate subspaces to smooth compact Riemannian manifolds with bounded curvature. 
Inspired by this work, in [39] the authors extend the principles of compressed sensing 
in terms of point recovery on smooth compact Riemannian manifolds. 

Besides these relevant results in compressing and coding-decoding high-dimensional 
"stationary" data, dimensionality reduction of complex dynamical systems and high- 
dimensional partial differential equations is a subject of recent intensive research. 
Several tools have been employed, for instance, the use of diffusion maps for dynam- 
ical systems [48], tensor product bases and sparse grids for the numerical solution of 
linear high-dimensional PDEs [261 UHl [M] [35] , the reduced basis method for solving 
high-dimensional parametric PDEs [7] IQ] l46 l [53 ] [54 ] [56 ] . 

In this paper we shall further explore the connection between data compression and 
tractable numerical simulation of dynamical systems. Eventually we address the so- 
lutions of associated high-dimensional kinetic equations. We are specially interested 
in dynamical systems of the type 

N 

x,{t) - h{Vx{t)) + f,,{Vx{t))x,{t), (1.1) 
i=i 

where we use the following notation: 
9 N E N - number of agents, 

• x{t) = {xi{t),...,XN{t)) G W^""^, where x^ : [0,T] R'', i = 1, . . . , N , 

pNxN 



./.jiM^x^^R, i,j^l,...,N, 

• V : R'^^^ -> R^^^, Vx := {\\xi - Xj\\gd)fj^^ is the adjacency matrix of the 
point cloud x. 

We shall assume that the governing functions fi and are Lipschitz, but we shall 
specify the details later on. The system (|l.ip describes the dynamics of multiple com- 
plex agents x{t) = {xi{t), . . . ,XN{t)) E M''^^, interacting on the basis of their mutual 
"social" distance 'Dx{t), and its general form includes several models for swarming and 
collective motion of animals and micro-organisms, aggregation of cells, etc. Several 
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relevant effects can be included in the model by means of the functions fi and , 
in particular, fundamental binary mechanisms of attraction, repulsion, aggregation, 
self-drive, and alignment [T31 [HI [231 EOl HlI ■ Moreover, possibly adding stochastic 
terms of random noise may also allow to consider diffusion effects [U [14]. How- 
ever, these models and motion mechanisms are mostly derived borrowing a leaf from 
physics, by assuming the agents (animals, micro-organisms, cells etc.) as pointlike 
and exclusively determined by their spatial position and velocity in M'' for cZ = 3 -I- 3. 
In case we wished to extend such models of social interaction to more "sophisticated" 
agents, described by many parameters (d ^ 3 -I- 3), the simulation may become com- 
putationally prohibitive. Our motivation for considering high-dimensional situations 
stems from the modern development of communication technology and Internet, for 
which we witness the development of larger and larger communities accessing infor- 
mation (interactive databases), services (financial market), social interactions (social 
networks) etc. For instance, we might be interested to simulate the behavior of cer- 
tain subsets of the financial market where the agents are many investors, who are 
characterized by their portfolios of several hundreds of investments. The behavior of 
each individual investor depends on the dynamics of others according to a suitable 
social distance determined by similar investments. Being able to produce meaningful 
simulations and learning processes of such complex dynamics is an issue, which might 
be challenged by using suitable compression/dimensionality reduction techniques. 
The idea we develop in this paper is to randomly project the system and its initial 
condition by Johnson-Lindenstrauss embeddings to a lower-dimensional space where 
an independent simulation can be performed with significantly reduced complexity. 
We shall show that the use of multiple projections and parallel computations allows 
for an approximate reconstruction of the high-dimensional dynamics, by means of 
compressed sensing techniques. After we explore the tractable simulation of the dy- 
namical systems (jl.ip when the dimension d of the parameter space is large, we also 
address the issue of whether we can perform tractable simulations when also the num- 
ber N of agents is getting very large. Unlike the control of a finite number of agents, 
the numerical simulation of a rather large population of interacting agents {N ^ 0) 
can constitute a serious difficulty which stems from the accurate solution of a pos- 
sibly very large system of ODEs. Borrowing the strategy from the kinetic theory of 
gases [16], we want instead to consider a density distribution of agents, depending on 
their d-parameters, which interact with stochastic infiuence (corresponding to classi- 
cal collisional rules in kinetic theory of gases) - in this case the infiuence is "smeared" 
since two individuals may interact also when they are far apart in terms of their 
"social distance" Vx. Hence, instead of simulating the behavior of each individual 
agent, we shall describe the collective behavior encoded by a density distribution 
whose evolution is governed by one sole mesoscopic partial differential equation. We 
shall show that, under realistic assumptions on the concentration of the measure /x 
on sets of lower dimension, we can also acquire information on the properties of the 
high-dimensional measure solution ^ of the corresponding kinetic equation, by con- 
sidering random projections to lower dimension. Such approximation properties are 
determined by means of the combination of optimal numerical integration principles 
for the high-dimensional measure |331 136] and the results previously achieved for 
particle dynamical systems. 
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1.1. Fundamental assumptions. We introduce the following notation for £p- 
norms of vectors v € R'', 




and 



•= Z^l"''l^ for l<p<c50, 



IK'lUl := max I?;, 

t—l,...,d 



For matrices x G W^^"^ we consider the mixed norm 

Mi-a^) ■■=\\Mi-)Zi\h, 

where Xi e M" is the i*''-column of the matrix x. 

For the rest of the paper we impose three fundamental assumptions about Lips- 
chitz and boundedness properties of fi and fij, 

|/.(a)-/.(6)| <i||a-6|lf«(,«), i-l,...,iV (1.2) 

N 



i=l,...,N ^ 



max > |/,;, (a)| < L', (1.3) 

. , „E - =^ - ^W'^-MY (1-4) 



max 



for every a,b £ M^^^. Unfortunately, models of real-life phenomena would not 
always satisfy these conditions, for instance models of financial markets or socio- 
economic interactions can be expected to exhibit severely discontinuous behavior. 
However, these assumptions are reasonable in certain regimes and allow us to prove 
the concept we are going to convey in this paper, i.e., the possibility of simulating 
high-dimensional dynamics by multiple independent simulations in low dimension. 

1.2. Euler scheme, a classical result of stability and convergence, and 
its complexity. We shall consider the system of ordinary differential equations of 
the form p.ip with the initial condition 

x,iO) = x°, i = l,...,N. (1.5) 

The Euler method for this system is given by (|1.5p and 



X. 



7+1 ■.= x? + h 



N 

MVx")+Y,h{'D^n^" 



n = 0, . . . , no — 1. (1-6) 



where h > is the time step and no '■= T/h is the number of iterations. We consider 
here the explicit Euler scheme exclusively for the sake of simplicity, for more sophisti- 
cated integration methods might be used. We start with a classical result, which we 
report in detail for the sake of the reader, and for simplicity we assume fij = for all 
iJ = l,...N. 

The simulation of the dynamical system (|1.6p has a complexity which is at least 
the one of computing the adjacency matrix Vx^ at each discrete time t", i.e., 0{d x 
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N'^). The scope of the next sections is to show that, up to an e-distortion, we can 
approximate the dynamics of by projecting the system into lower dimension and 
by executing in parallel computations with reduced complexity. Computation of the 
adjacency matrix in the new dimension requires only 0(e~^ log{N) x iV^) operations. 
Especially if the distortion parameter e > is not too small and the number of agents 
is of a polynomial order in d, we reduce the complexity of computing the adjacency 
matrix to 0(log(d) x N^). 

2. Projecting the Euler method: dimensionality reduction of discrete 
dynamical systems. 

2.1. Johnson-Lindenstrauss embedding. We wish to project the dynamics 
of into a lower-dimensional space by employing a well-known result of Johnson 
and Lindenstrauss [40], which we informally rephrase for our purposes as follows. 

Lemma 2.1 (Johnson and Lindenstrauss). Let V be an arbitrary set ofAf points 
in M**. Given a distortion parameter e > 0, there exists a constant 

fco = 0(e-'log(AA)), 

such that for all integers k > ko, there exists a k x d matrix M for which 

(1 - e)||x - < \\Mx - Mi\\]^ < (1 + e)||a: - (2.1) 

*-2 *-2 ^2 

for all x^ x ^ V . It is easy to sec that the condition 

(1 - e)\\pf,, < ||A/p||| < (1 + e)||p|||, p e (2.2) 

implies 

i^-^)\\p\\ii<\\Mp\\e.<{l + s)\\ph., peR^, (2.3) 

for < e < 1, which will be used in the following sections. On the other hand, (|2.3p 
implies (|2.2p with 3e instead of e. 

Our aim is to apply this lemma to dynamical systems. As the mapping M from 
Lemma 12.11 is linear and almost preserves distances between the points (up to the 
e > distortion as described above), we restrict ourselves to dynamical systems 
which are linear or whose non-linearity depends only on the mutual distances of the 
points involved, as in (|l.ip . 

Let us define the additional notation, which is going to be fixed throughout the 
paper: 

• d gN - dimension (large) , 

• e > - the distortion parameter from Lemma 12.11 

• A: G N - new dimension (small), 

• Af S R*-'^'' - randomly generated matrix as described below. 

The only constructions of a matrix M as in Lemma 12.11 known up to now are 
stochastic, i.e., the matrix is randomly generated and has the quasi-isometry property 
(|2.ip with high probability. We refer the reader to [53] and [TJ Theorem 1.1] for two 
typical versions of the Johnson-Lindenstrauss Lemma. 

We briefly collect below some well-known instances of random matrices, which 
satisfy the statement of Lemma 12.11 with high probability; 

• k X d matrices M whose entries mij are independent realizations of Gaussian 
random variables 

TOij - (o,i ) ; 
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k X d matrices M whose entries are independent realizations of ± Bernoulli 
random variables 



mi, 7 := 



+ -^, with probability ^ 
— with probability i 



Several other random projections suitable for Johnson-Lindenstrauss embeddings 
can be constructed following Theorem 13.61 recalled below, and we refer the reader to 
[33] for more details. 

2.2. Uniform estimate for a general model. If M G R*^^'' is a matrix, we 
consider the projected Euler method in R*^ associated to the high-dimensional system 
(fTSjl - pTe)) . namely 

y° := Mxl (2.4) 

N 



■■=y? + h 



A//,(2?'y") + ^/,,(I?'2/")y," 



71 = 0,..., 710-1. (2.5) 



We denote here V : R''^'^ M^^^, 2?'?; := {\\y, - yj\\t,k)^j=i, the adjacency matrix 
of the agents y = (j/i, . . . , y^) in K.'^^'^. The first result of this paper reads as follows. 
Theorem 2.2. Let the sequences 

{a;", i = 1, . . . , N and 71 = 0, ... , 7io} and {y", i — 1, . . . , N and n — 0, . . . , uq} 

be defined by pT5)) - pr^ and ((^ -(^3 1) with and satisfying pT^ - pTi)) and a 

matrix M e M''^'* with 

||A//,(2?V) - Mh{Vx-)\\,, <(! + £) ||/.(P'j/") - h{Vx-)h. , (2.6) 
||Mx;'||,. <(l + £)||x;'|i,., (2.7) 
(1 - e)\\x\^ - ^^hi < P/xr - Mx]\\,. < (1 + e)\\x: - x]\\,. (2.8) 

for all i, j = 1, . . . , and all n ~ 0, . . . ,nQ. Moreover, let us assume that 

a > max ||a;"||£d for all ?i = 0, ...,no, j — l,...,N. (2.9) 

j J 2 

Let 

el -Wy^ - Mx'IWi,,, t = l,...,N andn = 0,...,no (2.10) 



and set := max-; e". Then 



<ehnBcxpihnA), (2.11) 

where A:=L' + 2(1 + £){L + aL") and B := 2a(l + e)(L + ai"). 

We remark that conditions p.6p - p.8p are in fact satisfied as soon as M is a 
suitable Johnson-Lindenstrauss embedding as in Lemma l^TTl for the choice J\f — 2NnQ 
andfc = C'(e-2log(AA)). 
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Proof. Using ((^1^ and (HHJ-ldll) and combined with (US]) and 

we obtain 



N 



^/,,(I?'y")2/;-/,,(I?a;")Ma 



<er + Ml + £)||/.p'y")"/«(2?2;")ll,. 
<e- + h{l + e)mV'y^)-MVxnh. 

N 

+ hY,[\f,,{V'y")\e] + (1 + s)\\x]\\,. ■ IMV'yn ~ 
i=i 

Taking the maximum on both sides, this becomes 

We use (|1.2p - (|1.4p for a = V'y" and 5 = 2?x" to estimate all the terms on the 
right-hand side. This gives 

^n+i < ^ ^ e)L\\V'y'' - + h£''L' + h{l + e)aL"\\V'y" - 

< £"{1 + hL') + h{l + e){L + aL") [||2?'y" - 2?'A/a;"|lf« ) + ||I?'Afx" - (^«)] 

< £"(1 + /iL') + 2h{\ + £)(L + " + ae), 

where we used (|2.8p in the last line. This, together with £^ = 0, leads to 

£" < ehnBexp{hnA), 

where A -.^ L' + 2(1 + e){L + aL") and B 2a(l + £)(L + aL"). □ 

2.3. Uniform estimate for the Cucker-Smale model. As a relevant exam- 
ple, let us now show that Thcorcm l2.2l can be applied to the well-known Cucker-Smale 
model, introduced and analyzed in [22j [23] , which is described by 



1 ^ 



g{\\x,-Xj\\t,d){vj-Vi), i = l,. 



,N. 



(2.12) 
(2.13) 



The function g : [0, od) — > R is given by g{s) ~ -fj:^^2yn fo^' > 0; and bounded by 
g{0) = G > 0. This model describes the emerging of consensus in a group of interacting 
agents, trying to align (also in terms of abstract consensus) with their neighbors. One 
of the motivations of the model from Cucker and Smale was to describe the formation 
and evolution of languages [531 Section 6] , although, due to its simplicity, it has been 
eventually related mainly to the description of the emergence of flocking in groups of 
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birds 122] • In the latter case, in fact, spatial and velocity coordinates are sufficient to 
describe a pointlike agent (d = 3 + 3), while for the evolution of languages, one would 
have to take into account a much broader dictionary of parameters, hence a higher 
dimension d ^ 3 + 3 of parameters, which is in fact the case of our interest in the 
present paper. 

Let us show that the model is indeed of the type We interprctc the system 

as a group of 27V agents in R'', whose dynamics is given by the following equations 

N 
N 

with f^^iVx) := E-9(ll^^ " ^^H^?)' /^(^^) ^= J^aiW^^ " ^^H^^)' 

fe=i 

for i ^ j. The condition (|1.2p is empty, (|1.3p reads 

L' > max(l,2G) > max 1 1, | - x^W,.)^ . 

Finally, 

2 ^ I 

911 II ^ 

< max^lp . Y,\\\x- - x]\\,. Wy- - y-\\,. 

<2||.g||Lip-||W-I?a:;"|l,«(,«) 

shows that L" < 2||g||Lip- The boundedncss of the trajectories in the phase-space of 
(|2.12p - (|2.13p at finite time has been proved, for instance, in [37], see also [iHl Theorem 
4.6]. The boundedncss at finite time is clearly sufficient to define the constant a 
appearing in Theorem 12. 2[ also because we are mainly interested in the dynamics 
for short time, due to the error propagation. Of course the constant a might grow 
with time, but, for instance, for the Cucker-Smale system it grows at most linearly 
in time [14j : as in the error estimate (|2.1ip we have an exponential function in time 
appearing, the possible linear growth can be considered a negligible issue; moreover, 
as our numerical experiments show, see Section 13. 5[ the situation is much better 
in practice, and suitable scaling, as indicated below, allows us to assume in several 
circumstances that the constant a is uniformly bounded for all times. In fact, even 
when we were interested in longer time or even asymptotical behavior, especially when 
pattern formation is expected, then we would observe the following additional facts: 
In the Cuckcr-Smalc model the center of mass and the mean velocity are invariants 
of the dynamics. Moreover the rate of communication between particles is given by 
9i^) ~ {i+s^)f • When /? < 1/2 it is know (see [14]) that the dynamics will converge 
to a flocking configuration. In this case one can translate at the very beginning the 
center of mass and the mean velocity to 0, and the system will keep bounded for all 
times. Hence in this case the constant a can also be considered uniform for all times 
(not only bounded at finite time). 
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2.4. Least-squares estimate of the error for the Cucker-Smale model. 

The formula (|2.11[) provides the estimate of the maximum of the individual errors, i.e., 
£" := 11(2/" — In this section we address the stronger £2 (^2)"6stimate 
for the error. For generic dynamical systems (jl.ip such estimate is not available in 
general, and one has to perform a case-by-case analysis. As a typical example of 
how to proceed, we restrict ourselves to the Cucker-Smale model, just recalled in the 
previous section. The forward Euler discretization of (|2.12p - (|2.13p is given by 



,,n + l 



(2.14) 



with initial data and given. Let M be again a suitable random matrix in the 
sense of Lemma l2.1l The Euler method of the projected system is given by the initial 
conditions = Mx'^ and = A/?j,° and the formulas 



(2.15) 



N 



^E-9(iiyr-y"ii.,0K-o- 



We are interested in the estimates of the following quantities 



Mv]' 



CI: 



N 



\ 



N 



N 



|«-M<)f^ill,«(,. 



TV 



(2.16) 



(2.17) 



Theorem 2.3. Let the sequences {a;f }, {wf}, {yf}, {e",i} and {e" J, i = 

1,...,N and n = l,...,no be given by (plU) . (p7I^ . ((2ll)) and 0J7]\ . respectively. 
Let e > and let us assume, that the matrix M satisfies 

(1 ~ e)|j< - x]h. < \\Mx: ~ Mx]\\,. < (1 + e)\\x^ - x]\\,. and 

(1 - e)\K - v^Wii < - ^'Hh ^ (1 + - ""ll^f 

for all i, j ~ 1, . . . , N and n = 0, . . . , riQ. 

Then the error quantities f " and £y introduced in (j2.16p and (|2.17p satisfy 



^{£^f + (£„")2 < £(1 + e)hn\\g\\upVXcMhn\\A\\), 



where V := max- 



A 



1 

2(l+e)||5llLip^ 2G^ 



Proof Using (|2.14p and p.lSp . we obtain 

el't' < e" , + he: , and C*"' < C + /^C- 



(2.18) 
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To bound the quantity wc have to work more. We add and subtract the term 
giWyf - V'jWu^iMv]^ - Mvf) and apply (pji)) and I^JQ. This leads to 

h ^ 

-5(||xr-x7||,.)(Aft;;'-A/<)||,. 
h ^ 

^ + 1^ E-9(iiyr - yl\k)i<,3 + <^) (2-19) 



N 



i=i 

We estimate the first summand in (|2.19p 

, N ] C ^ hC ^ 

^ Effdl^^r - y;il.,0(e", + <.) < ^ [iVe^:,, + ^ el^] ^ hGel, + ^ E 
and its ^2-norm with respect to i by Holder's inequality 

n' 



/iV^GC + ^(E(E<.) 1 <2/ix/iVG£;. (2.20) 



To estimate the second summand in p.lOp we make use of 

|ll^r-^"ll.f-||yr-y"ll.,^| 

< |ll< - 411^;; - - Mx1\U\ + IllAf^r - Mx"\U - ll^r - y"\\ 



We arrive at 

N 

llffllLip 

TV 



^^^'^^"^""'^ E II"" - - x-\\,i + + el,) 

< (^+--yi^-p^ (.f li.r - 411..^ + + E <. 



j=i i=i 



The €2-norm of this expression with respect to i is bounded by 



(E(Eii-"--"ii^.^) ) +^(E««:)') +^E4'. 

I z^i j;'— 1 i— 1 1 



N I 

< (1 + £)/i||5||LipV^ViV(eX + 2£^). (2.21) 
Combining (|2.19p with (|2.20p and (|2.2ip leads to the recursive estimate 

< s: + hs:, (2.22) 

< C + + h{l + e)||.g||Lip'^^ {eX + 28^} , 
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which we put into the matrix form 



Sh') - (I) + ((1 + e)eh\\g\\^,,V^ ' ^'^^^'^ 



where is a 2 x 2 matrix given by 
A! ^Id + hA : 
Taking the norms on both sides of (j2.23p leads to 



l) +''(2(l + £)||g||Lipy 2G 



+ < (1 + h\\A\\W{E^Y + {8^Y+e{\ + e)h\\gU,^VX, 

which gives the least-squares error estimate (|2.18p . □ 

3. Dimensionality reduction for continuous dynamical systems. 

3.1. Uniform estimates for continuous dynamical systems. In this section 
we shall establish the analogue of the above results for the continuous time setting of 
dynamical systems of the type (jl.ip . 

N 

= /,(2?x)+^/,;,(27x)x,, i = l,...,N, (3.1) 

a;i(0)=x°, i = l,...,N. (3.2) 

We adopt again the assumptions about Lipschitz continuity and boundedncss of the 
right-hand side made in Section [2j namely (|1.2p . (|1.3p and (|1.4I) . 

Theorem 3.1. Let x(t) G K'^^^, t e [0,r], be the solution of the system ([XT]) - 
p.2p with fi 's and fij 's satisfying (|1.2p - (jl.4p . such that 

max max\\xi(t) — Xj(t)\\pd <a. (3-3) 
te[o,T] i,j 2 

Let us fix k G N, k < d, and a matrix M G R*''^'' such that 

(1 - e)\\x4t) - x,{t)\\,. < \\Mx,it) - Mx,it)\\,. < (1 + e)\\x,it) - x,it)\\,. ,(3.4) 

for all t e [0, T] and i, j = I,..., N. Let y{t) G R''''^, t G [0, T] be the solution of 
the projected system 

N 

= Mf^iV'y) + ^ /„(I?'2/)% , i = l,...,N, 

y,(0) = Afa:^ i = l,...,iV, (3.5) 
such that for a suitable /3 > 0, 

max ||j/(i)|L„,„d-, < /3. (3.6) 

tG[0,T] ""=ooi«2J 

Let us define the column-wise £2-error ei{t) := \\yi — AfxiH^i- for i = 1, . . . , N and 
£{t) := jnax^ei(t) = \\y ~ A/x-||^„^^fc^ . 
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Then we have the estimate 

8{t) < eat{L \\M\\ + L" 13) exp [(2L ||A/|| + 2/?L" + L')t\ 



(3.7) 



Proof. Due to (|1.2p - (|1.4p . we have for every i = 1, . . . , TV the estimate 



dt 



{y,- Mx,,f^{y, - Mxj)) 
lly, - Mx^\\i,u 



< 



— iy,-Mx,) 



N 



< 



WMMV'y) - Mf,iVx)\\,. + \\hi'D'y)yj - hj{Vx)Mx,\\,. 



N 



< 



L \\M\\ \\V'y - + J2 {\\nj{'Dx){Mx, - y,)\\,. + m.iVx) - f,,{V'y))y,\\,. 



< L \\M\\ WV'y - 'Dx\\^N(^iN) + L' \\Mx - J/||fN(ffc) + L" j|2?x - WvWi'^ii'^) 

The term \\V'y -VxWgN (^^n) < WV'y - V Mx\\^m i^^n ) + WV Mx ~ VxW^m (^^n ) is esti- 
mated by 



\V'y-VMx\\ 



max 



< max \\yi — Mxi\\(k + \\yj — Mxj\\ik < 2£{t) , 



and, using the assmnption p.4p . 



\V'Mx -Vx\\ 



max 



\\Mx^- MxjWgk - \\xi-Xj\\t 



< £ max x i — X 



e||2?a;|| 



Finally, by the a priori estimate p.3p for HPxH^jv (^n 
obtain 



and 



for we 



dt 



ei<L \\AI\\ {2£{t) + ea) + L'£(t) + L" (3{2£{t) + ea) 
= {2L \\M\\ + 2PL" + L')£{t) + ea{L \\M\\ + L" p) . 



Now, let us split the interval [0,r) into a union of finite disjoint intervals Ij = 
[tj-i,tj), j = 1,...,K for a suitable K £ N, such that £{t) = ei(^j){t) for t E Ij. 
Consequently, on every we have 

(t) = ^e,(,)(i) < {2L \\M\\ + 2/3i" + L')£{t) + ea{L + , 
and the Gronwall lemma yields 

£{t) < [ea{L \\M\\ + L" P){t - tj^i) + £{tj^i)\ exp {{2L \\M\\ + 2/3L" + L'){t - tj_i)) 

for t € [tj-i,tj). A concatenation of these estimates over the intervals Ij leads finally 
to the expected error estimate 

£{t) < eat{L \\M\\ + L" f3) exp [{2L \\M\\ + 2/3L" + L')t] . 



□ 
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3.2. A continuous Johnson-Lindenstrauss Lemma. Let us now go through 
the assumptions we made in the formulation of Theorem 13 . 1 1 and discuss how they re- 
strict the validity and applicability of the result. First of all, let us mention that p.3p 
and (j3.6[) can be easily proven to hold for locally Lipschitz right-hand sides fi and 
fij on finite time intervals. Obviously, the critical point for the applicability of The- 
orem [33] is the question how to find a matrix M satisfying the condition p.4[) . i.e., 
being a quasi-isometry along the trajectory solution x{t) for every t g [0,r]. The an- 
swer is provided by the following generalization of the Johnson-Lindenstrauss Lemma 
(Lemma 12. ip for rectifiable C^-curves, by a suitable continuity argument. Let us 
stress that our approach resembles the "sampling and e-net" argument in [3l IH [59] for 
the extension of the quasi-isometry property of Johnson-Lindenstrauss embeddings to 
smooth Riemmanian manifolds. From this point of view the following result can be 
viewed as a specification of the work [U [SH] . 
We first prove an auxiliary technical result: 

Lemma 3.2. Let Q < e < e' < 1, a e and let M : R'' M'^' be a linear 
mapping such that 

(l-£)||a||,. < ||A/a||,. < (l + £)||a||,.. 

Let X eM.'^ satisfy 

is' — e)||a|Ld 

\\a~x\\ < , „ (3.8) 
" " - \\M\\ + l+e' ^ ' 

Then 

{l-e')\\x\\,. < \\Mx\\,. < (l + £')ll^ll.- (3.9) 



Proof. If a = 0, the statement is trivial. If a 7^ 0, we denote the right-hand side 
of (|3.8|) by r > and estimate by the triangle inequality 

\\Mx\\,. _ \\M{x-a) + Ma\\,. ^ \\M\\ ■ ||x - a||,. + (1 + £)||aH,. 



||a:||^d \\x-a + a\\id - - 

|jM|j •r + (l+e)||a||^d 
< - — „ „ < 1 + e' . 

A similar chain of inequalities holds for the estimate from below. □ 
Now we are ready to establish a continuous version of Lemma 12.11 
Theorem 3.3. Let (p : [0, 1] ~^R'^ be a curve. Let < e < e' < 1, 

WiOWii r 7 

7 := max - — ^ < 00 and M > (\'d + 2)- — . 

MOWii ~ ' e'-e 

Let k be such that a randomly chosen (and properly normalized) projector M satisfies 
the statement of the Johnson-Lindenstrauss Lemma \2.1\ with e,d,k and M arbitrary 
points with high probability. Without loss of generality we assume that \\M\\ < \fdjk 
within the same probability (this is in fact the case, e.g., for the examples of Gaussian 
and Bernoulli random matrices reported in Section\^. 
Then 

(1 - e')Mt)\\,. < WM^m,. <{1 + e')Mt)\\,., for all t e [0, 1] (3.10) 
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holds with the same probability. 

Proof. Let ti = i/M, i — Q, . . . ,Af and put 

:= arg max^gfj^ t^^^]||(p'(^)||^d, i = 0, . . . , A/" - 1. 

Let M : R'' M'^' be the randomly chosen and normahzed projector (see Lemma [2TT|) . 
Hence ||M|| < i/d/^ and 

(1 - e')Mm,. < ||M(^(T,))||,. < (1 + e')MT,)\\,., i = l,...,N (3.11) 

with high probabihty. We show that p.lOp holds with (at least) the same probability. 

This follows easily from p. lip and the following estimate, which holds for every 
t G [ti, ti+i], 

ll^(i) - (p(r,)|l^d < 11^ {s)\\iids < — < — 2) — 

^ MT,)\\,.{e' ~ e) ^ MT,)\\,.{e' ~ e) 
Vd + 2 - p/|| + l + £' 

The proof is then finished by a straightforward application of Lemma 13.21 □ 
Remark 1. We show now that the condition 

7 := max „ ,^.,| - < oo 

is necessary, hence it is a restriction to the type of curves one can quasi-isometrically 
project. Let d > 3. It is known that there is a continuous curve ip : [0, 1] — >■ [0, 1]''""'^, 
such that (p{[0,l]) ~ [0,1]''"^, i.e., (p goes onto [0,1]''"^. The construction of such a 
space-filling curve goes back to Peano and Hilbert. After a composition with suitable 
dilations and d-dimensional spherical coordinates we observe that there is also a sur- 
jective continuous curve (p : [0,1] — > E>'^~^, where denotes the £2 unit sphere in 

As M was supposed to be a projection, p.lOp cannot hold for all t's with ip{t) € 
ker M 

Obviously, the key condition for applicability of Theorem 13.31 for finding a pro- 
jection matrix M satisfying p.4p is that 



1 1 X'i X jy\^ £d 

sup max-r < 7 < 00 . (3.12) 

te[o,T] *J \\Xi~Xj\\(d 

This condition is, for instance, trivially satisfied when the right-hand sides /i's and 
/ij's have the following Lipschitz continuity: 

Wf^iVx) ^ f,{Vx)\\,. <L"'\\x,- Xj\\i. foraU^,J = l,...,7V, 
\f^A'^^) ~ hA'Dx)] < L""\\x, - a-, II,. for alH, J, fc - 1, . . . , Af. 

We will show in the examples below how condition p.l2p is verified in cases of dynami- 
cal systems modeling standard social mechanisms of attraction, repulsion, aggregation 
and alignment. 
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3.3. Applicability to relevant examples of dynamical systems describ- 
ing social dynamics. In this section we show the applicability of our dimensionality 
reduction theory to well-known dynamical systems driven by "social forces" of align- 
ment, attraction, repulsion and aggregation. Although these models were proposed as 
descriptions of group motion in physical space, the fundamental social effects can be 
considered as building blocks in the more abstract context of many-parameter social 
dynamics. It has been shown |14( I50| that these models are able to produce meaning- 
ful patterns, for instance mills in two spatial dimensions (see Figure [XT]) . reproducing 
the behavior of certain biological species. However, we should expect that in higher 




/ 



Fig. 3.1. Mills in nature and in models 



dimension the possible patterns produced by the combination of fundamental effects 
can be much more complex. 

3.3.1. The Cucker-Smale system (alignment efTect). As shown in Sec- 
tion[2J the Cucker and Smale flocking model (|2.12p - (|2.13p is of the type (jl.ip . satisfies 
the Lipschitz continuity assumptions (|1.2p - (|1.4p . and it is bounded at finite time, as 
already discussed in Section 12.31 Therefore, to meet all the assumptions of Theo- 
rem [O] we only need to check that it also satisfies the condition p.l2p . However, 
for this we need to consider a slightly different framework than in Section [231 instead 
of considering the 2A'' d-dimensional variables {N position variables and N velocity 
variables), we need to arrange the model as A'' variables in R^'^, each variable consist- 
ing of the position part (first d entries) and of the velocity part (the other d entries) . 
We have then 



- ijWii + hi - vjWiii < h - vj 



< \\v, - V, 



< \\v, - f, 



< \\vi - Vj 

for a suitable constant c depending 
boundedncss of the term I 



1 ^ 



k=l 



'''■Y^lhi^ ^kWli ~ \\Xj - Xk\\id\\\vk\\i,d 



N 



\\9\\up 
N 



' N \ 

"^hkWei ] hi-xjWid 



\gd + c\xi — ll^d , 

on the initial data. We used here the a-priori 
Wfcll^d ), see [23] or [38] for details. Consequently, 
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we can satisfy (j3.12p with 7 — max(l,c). 

3.3.2. Second order dynamic model with self-propulsion and pairwise 
interactions (self-drive, attraction, and repulsion effects). Another practi- 
cally relevant model which fits into the class given by (|l.ip is a second order dynamic 
model with self-propulsion and pairwise interactions, |45[ I50| : 

= Vi , (3.13) 

v, = {a~b\\v,\\^^,)v,~j^J2\/,,U{\\x,-Xj\\^,), i^l,...,N, (3.14) 

where a and b are positive constants and U : [0, 00) — > M is a smooth potential. We 
denote u{s) = U'{s)/s and assume that u is a bounded, Lipschitz continuous function. 
We again arrange the model as a system of N variables in M^'^, each variable consisting 
of the position part (first d entries) and of the velocity part (the other d entries). 
Consequently, the model can be put into a form compliant with (|l.ip as follows: 

N 

N N 



with f[f = S,j, f^^iVx) = -jf Y.j^i u{\\x^ - Xj\\g^) and f^j'iVx) = jfU{\\xi - Xj\\i^) 
for i ^ j. Moreover, we may set /™(I'w) = Sij{a—b\\vi\\'ja) by introducing an auxiliary, 

noninfluential constant zero particle {xo,vo) ~ (0,0) with null dynamics, i.e., /q* — 
and /q* = 0, where *,* g {x,v}. Then, (|1.2p is void, while (|1.3p is satisfied by 

m^xY,{\f^;{Vx,Vv)\ + \f^;{Vx,Vv)\ + \f^f {Vx,Vv)\) 
j 

< l + a + bmax\\vi\\^^i +2||u||^^ < L' , 

since the theory provides an apriori bound on f3y := supjgjQ -^j max^ Ht'illfd, see |50| . 
Condition (|1.4[) for /[^^ is void, while for /™ it is satisfied by 

max^ |/™(I?«) - f^f{Vw)\ < 6max \\v,\f,, - \\w,f,. 



3 



< 6max(^|ii;i||^d + \\wi\\idj \\v, - w,\\gi 

< L" WVv - 'Dw\\^N(^iN^ , 

where we again use the apriori boundedness of /?„. For /^^^ is (|1.4p satisfied by 
^ 2 ^ I 



< 2|lM!lLipll^'2;-2?y||^«(^„; 



PARTICLE AND KINETIC MODELING IN HIGH DIMENSION 



17 



Finally, it can be easily checked that condition p.l2p is satisfied by 

-ijllfd + -Vj\\g^ < {I + a + 3bl3l)\\v, - Ujll^d + (\\u\\^^ + 2/3^ IMup) W^^ ~ ^oWii > 

where fix sup^^ jq -^j max^ ||xi||^d. We notice that also this model is bounded at 
finite time as shown in jl31 Theorem 3.10 (formula (22))], and therefore for any 
fixed horizon time T, there is a constant a = a{T) > such that (|2.9p and p.3p 
hold. In the paper |50j it is shown that this model tends to produce patterns of 
different quality, in particular mills, double mills, and translating crystalline flocks 
(see also Figure [5TT|) . These patterns were further studied in [15]. Starting from the 
Liouville equation for the many-body problem the authors derive the corresponding 
kinetic and macroscopic hydrodynamic equations. The kinetic theory approach leads 
to the identification of macroscopic structures otherwise not recognized as solutions 
of the hydrodynamic equations, such as double mills of two superimposed flows. The 
authors found conditions allowing for the existence of such solutions and compared 
them to the case of single mills. In [17] the authors utilize the methods of classical 
statistical mechanics to connect the individual-based models of the type p.l3p - p.l4p 
to their continuum formulations and determine criteria for the validity of the latter. 
They show that H-stability of the interaction potential plays a fundamental role in 
determining both the validity of the continuum approximation and the nature of 
the aggregation state transitions. They perform a linear stability analysis of the 
continuum model and compare the results to the simulations of the individual-based 
one. 

Without entering into further details, let us stress that mills and double mills are 
uniformly bounded in time (and stable). Hence in these cases, we can assume that 
actually the constant a is again bounded for all times. Moreover, when the dynamics 
converges to a translating crystalline flocks, we may reason in a similar way as done 
for the Cucker-Smale model (although in this case the pattern in unstable). 

3.4. Recovery of the dynamics in high dimension from multiple simula- 
tions in low dimension. The main message of Theorem 13. II is that, under suitable 
assumptions on the governing functions fi, fij, the trajectory of the solution y{t) of the 
projected dynamical system p.Sp is at an e error from the trajectory of the projection 
of the solution x{t) of the dynamical system (|3.ip - p.2[) . i.e., 

y,(t) w Mx,(t) or, more precisely, \\Mxi{t) - y^(t)||^fc < C(t)e, t e [0,T]. (3.15) 

We wonder whether this approximation property can allow us to "learn" proper- 
ties of the original trajectory x{t) in high dimension. 

3.4.1. Optimal information recovery of high-dimensional trajectory from 
low-dimensional projections. In this section we would like to address the following 
two fundamental questions: 

(i) Can we quantify the best possible information of the high-dimensional tra- 
jectory one can recover from one or more projections in lower dimension? 

(ii) Is there any practical method which performs an optimal recovery? 

The first question was implicitly addressed already in the 70 's by Kashin and later 
by Garnaev and Gluskin [41) I32| , as one can put in relationship the optimal recovery 
from linear measurements with Gelfand width of £p-balls, see for instance [18]. It was 
only with the development of the theory of compressed sensing |12[ 127) that an answer 
to the second question was provided, showing that £i-minimization actually performs 
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an optimal recovery of vectors in high dimension from random hnear projections to 
low dimension. We address the reader to [311 Section 3.9] for further details. In 
the following wc concisely recall the theory of compressed sensing and we apply it to 
estimate the optimal information error in recovering the trajectories in high dimension 
from lower dimensional simulations. 

Again a central role here is played by (random) matrices with the so-called Re- 
stricted Isometry Property RIP, cf. |llj . 

Definition 3.4 (Restricted Isometry Property). A k x d matrix M is said to 
have the Restricted Isometry Property of order K < d and level S e (0, 1) if 

(1 - 5)||a;||| < ||Ma;||| < (1 + 5)||x||| 

for all K -sparse x e 12k ^ {z £ R'^ : #supp (z) < K}. 

Both the typical matrices used in Johnson-Lindenstrauss embeddings (cf. Lemma 
12. ip and matrices with RIP used in compressed sensing are usually generated at 
random. It was observed by [3] and j44| , that there is an intimate connection between 
these two notions. A simple reformulation of the arguments of [3] yields the following. 

Theorem 3.5 (Baraniuk, Davenport, DeVore, and Wakin). Let M be a k x d 
matrix drawn at random which satisfies 

(1 - S/2)\\x\\% < llA/xlll < (1 + 5/2)|lx|||, xer 

for every set V C with j^V < (^)^ with probability < ly < 1. Then M 
satisfies the Restricted Isometry Property of order K and level S/3 with probability at 
least equal to v. 

Combined with several rather elementary constructions of Johnson-Lindenstrauss 
embedding matrices available in literature, cf . [1] and |25j , this result provides a simple 
construction of RIP matrices. The converse direction, namely the way from RIP 
matrices to matrices suitable for Johnson-Lindenstrauss embedding was discovered 
only recently in [44] . 

Theorem 3.6 (Krahmer and Ward). Fix 77 > and e > 0, and consider a finite 
set V C R"^ of cardinality \V\ = TV. Set K > 40 log and suppose that the k x d 

matrix M satisfies the Restricted Isometry Property of order K and level 5 < e/4. 
Let ^ £ M."^ be a Rademacher sequence, i.e., uniformly distributed on {—1, l}'^ . Then 
with probability exceeding 1 — 77, 

(l-£)i|.i:||2, < ||Mx||| < (l + e)||a;||2,. 

uniformly for all x £ V , where M M diag(^), where diag(^) is a d x d diagonal 
matrix with ^ on the diagonal. 

We refer to [5T] for additional details. 

Remark 2. Notice that M as constructed in Theorem \ 3.6] is both a Johnson- 
Lindenstrauss embedding and a matrix with RIP, because 

(1 - S)\\x\\l, - (1 - (5)11 diag(Ox||| < II Mdiag(e)x||| 

■.=M 

< (1 + J)||diag(e)x||| = (l + <5)||.x|||. 

The matrices considered in Section [H satisfy with high probability the RIP with 

k 



K = 



1 + \og{dlk) 
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Equipped with the notion of RIP matrices we may state the main resuh of the 
theory of compressed sensing, as appearing in f28| . which we shah use for the recovery 
of the dynamical system in Mf^. 

Theorem 3.7. Assume that the matrix M e E''^'' has the RIP of order 2K and 
level 



62K < ^ « 0.4627. 




Then the following holds for all x £ M''. Let the low- dimensional approximation 
y = Mx + rj be given with \\ri\\gk < Ce. Let x"^ he the solution of 



mm 



llzjl^d subject to \\Mz — uW^k <\\rj\\^k. (3.16) 



aK{x)i 

fd < Cl£ + C2 ' 



Then 



K 

for some constants Ci,C2 > that depend only on52K, and aK{x)id — inf^.^gupp (2)<^ \\z- 
X^£d IS the best-K -term approximation error in £f. 

This result says that provided the stability relationship p.lSp . we can approximate 
the individual trajectories Xi{t), for each t S [0,T] fixed, by a vector xf{t) solution 
of an optimization problem of the type (j3.16p , and the accuracy of the approximation 
depends on the best-A'-term approximation error aK{xi{t))gd. Actually, the results 
in [121 HZ] in connection with [THl STJ , state also that this is asymptotically the 
best one can hope for. One possibility to improve the recovery error is to increase 
the dimension k (leading to a smaller distortion parameter e > in the Johnson- 
Lindenstrauss embedding). But we would like to explore another possibility, namely 
projecting and simulating in parallel and independently the dynamical system L-times 
in the lower dimension k 

N 

yl = M'f,{V'/)+Y,hA'D'/)yl yf(0) = A/^x°, €=1,...,L. (3.17) 

Let us give a brief overview of the corresponding error estimates. The number of 
points needed in each of the cases \s N ^ N x n^, where N is the number of agents 
and no = T /h is the number of iterations. 

• We perform 1 projection and simulation in M'^': Then e = C>{^^J'^^-^ , K = 
O ( i+\ogi^d/k) ) ^-"^"^ application of Theorem 13 . 71 leads to 



ii.,w-f«)iu. <c'(«) ^ (3,18) 



Here, C'{t) combines both the constants from Theorem 13.71 and the time- 
dependent C{t) from (|3.15p . So, to reach the precision of order C'{t)e > 0, we 

have to choose k gN large enough, such that y < e and — - < e. 

We then need k x iV^ operations to evaluate the adjacency matrix. 
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Wc perform 1 projection and simulation in 



pLxk. 



Then e' = O 



logA^ 
Lk 



and 



K' = O 



Lk 

l+log(d/Lfc) 



and an application of Theorem 13.71 leads to 



\\x.it)~xfit)\U<C'it) 



/log TV (TK'{Xt(t))f,^ 



Lk 



/K' 



(3.19) 



The given precision of order C'{t)e > 0, may be then reached by choosing 

k,L eN large enough, such that yf^^ < e and ^ < e. We then 

need Lk x A^^ operations to evaluate the adjacency matrix. 

We perform L independent and parallel projections and simulations in M'^': 



Then we assemble the following system corresponding to p.l7p 



Mx 



( \ 

M2 



( v\ \ 



\ yt J 



where for all 



1, . . . ,L the matrices £ 



pkxd 



are (let us say) ran- 



dom matrices with each entry generated independently with respect to the 
properly normalized Gaussian distribution as described in Section [21 Then 
m/Vl is a Lk X d matrix with Restricted Isometry Property of order K' = 

O ( i+io^(d/Lfc) ) S < 0.4627. The initial distortion of each of the 



projections is still e = O 
can compute xf{t) such that 



log TV 
k 



Therefore, by applying Theorem 13.71 we 



\Mt)-xfm,.<c'it) 



/log TV aK'{x^{t))^,a 



(3.20) 



Notice that the computation of xf{t) can also be performed in parallel, see, 
e.g., |29j . The larger is the number L of projections we perform, the larger 
is K' and the smaller is the second summand in (j3.20p : actually a^' {xi{t))gd 
vanishes for K' > d. Unfortunately, the parallelization can not help to reduce 
the initial distortion e > 0. To reach again the precision of order C'{t)e > 0, 

we have to choose fc G N large enough, such that 
chose L > 1 large enough such that 



logAf 
k 



/K' 



< e. Then we 
< e. We again need k x N-^ 



operations to evaluate the adjacency matrix. 
In all three cases, we obtain the estimate 



\x^it)~xtit)hd<C'it) 



crK{Xiit))fd 



(3.21) 



where the corresponding values of e > and K together with the number of operations 
needed to evaluate the adjacency matrix may be found in the following table. 
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number of operations 



1 projection into 



O 



logA^ 
fc 



o 



l+log(d/fe) 



1 projection into 



nLxfc 



o 



logAA 
Lk 



' i+iog(d/Lfc; 



Lfc X 



L projections into 



O 



log AT 
k 



o 



Lk 

l+\os(d/Lk] 



kx 



3.4.2. Optimal recovery of trajectories on smooth manifolds. In recent 
papers [H I59( I39j . the concepts of compressed sensing and optimal recovery were 
extended to vectors on smooth manifolds. These methods could become very useful in 
our context if (for any reason) we would have an apriori knowledge that the trajectories 
Xi {t) keep staying on or near such a smooth manifold. Actually this is the case, for 
instance in molecular dynamics, where simulations, e.g. in the form of the coordinates 
of the atoms in a molecule as a function of time, lie on or near an intrinsically-low- 
dimensional set in the high-dimensional state space of the molecule, and geometric 
properties of such sets provide important information about the dynamics, or about 
how to build low-dimensional representations of such dynamics |52[ I58j . In this case, 
by using appropriate recovery methods as described in [39], we could recover high- 
dimensional vectors from very low dimensional projections with much higher accuracy. 
However, this issue will be addressed in a following paper. 

3.5. Numerical experiments. In this section we illustrate the practical use 
and performances of our projection method for the Cucker-Smale system (|2.12p - (|2.13p . 

3.5.1. Pattern formation detection in high dimension from lower di- 
mensional projections. As already mentioned, this system models the emergence 
of consensus in a group of interacting agents, trying to align with their neighbors. The 
qualitative behavior of its solutions is formulated by this well known result [^[^155] : 

Theorem 3.8. Let {xi(t),Vi(t)) be the solutions of (|2.12p - (|2.13p . Let us define 
the fluctuation of positions around the center of mass Xc{t) ~ j^'^iLi^iit) , and, 
resp., the fluctuation of the rate of change around its average Vc{t) = jj X^i^i '^^ 



Mt) 



1 " 



Xc{t)\\% , 



1 ^ 



i=i 



Then if either 13 < 1/2 or the initial fluctuations A(0) and T(0) are small enough 
(see for details), then T{t) —> as t —> oo. 

The phenomenon of V{t) tending to zero as t — )■ oo is called flocking or emergence 
of consensus. If /? > 1/2 and the initial fluctuations are not small, it is not known 
whether a given initial configuration will actually lead to flocking or not, and the only 
way to find out the possible formation of consensus patterns is to perform numerical 
simulations. However, these can be especially costly if the number of agents N and 
the dimension d are large; the algorithmic complexity of the calculation is 0{dx N'^). 
Therefore, a significant reduction of the dimension d, which can be achieved by our 
projection method, would lead to a corresponding reduction of the computational 
cost. 

We illustrate this fact by a numerical experiment, where we choose = 1000 
and d = 200, i.e., every agent i is determined by a 200-dimensional vector Xi of its 
state and a 200-dimensional vector Vi giving the rate of change of its state. The 
initial datum (x^,v^) is generated randomly, every component of x^ being drawn 
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k=100 k=10 




200 150 100 50 25 10 5 2 200 150 100 50 25 10 5 

dimension dimension 



Fig. 3.2. Numerical results for /3 = 1.5; First row shows the evolution of r(t) of the system 
projected to dimension k = 100 (left) and fc = 10 (right) in the twenty realizations, compared to the 
original system (bold dashed line). Second row shows the initial values r(t = 0) and final values 
r(t = 30) in all the performed simulations. 



independently from the uniform distribution on [0, 1] and every component of v*^ being 
drawn independently from the uniform distribution on [—1,1]. We choose f3 — 1.5, 
1.62 and 1.7, and with every of these values we perform the following set of simulations: 

1. Simulation of the original system in 200 dimensions. 

2. Simulations in lower dimensions k: the initial condition (a;°,w'^) is projected 
into the fc-dimensional space with a random Johnson-Lindenstrauss projection 
matrix M with Gaussian entries. The dimension k takes the values 150, 100, 
50, 25, 10, 5, and 2. For every k, we perform the simulation twenty times, 
each time with a new random projection matrix M . 

All the simulations were implemented in MATLAB, using 1500 steps of the forward 
Euler method with time step size 0.02. The paths of r(t) from the twenty experiments 
with k = 100 and fc = 25 or fc = 10 are shown in the first rows of Figs. I3.2[ 13.31 and. 
rcsp.. l5^ for /3 = 1.5, 1.62 and, resp., 1.7. 

The information we are actually interested in is whether flocking takes place, in 
other words, whether the fluctuations of velocities T{t) tend to zero. Typically, after an 
initial phase, the graph of T(t) gives a clear indication either about exponentially fast 
convergence to zero (due to rounding errors, "zero" actually means values of the order 
2^0-30 jj^^ ^Yic simulations) or about convergence to a positive value. However, in certain 
cases the decay may be very slow and a very long simulation of the system would be 
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k=100 k=25 




200 150 100 50 25 10 5 2 200 150 100 50 25 10 5 2 

dimension dimension 



Fig. 3.3. Numerical results for /3 = 1.62; First row shows the evolution ofr{t) of the system 
projected to dimension k = 100 (left) and k = 25 (right) in the twenty realizations, compared to the 
original system (bold dashed line). Second row shows the initial values r(t = 0) and final values 
r(t = 30) in all the performed simulations. 

needed to see if the limiting value is actually zero or not. Therefore, we propose the 
following heuristic rules to decide about flocking from numerical simulations: 

• If the value of F at the final time t = 30 is smaller than 10"^", wc conclude 
that flocking took place. 

• If the value of r(30) is larger than 10"'^, we conclude that flocking did not 
take place. 

• Otherwise, we do not make any conclusion. 

In the second rows of Figs. l3.2[|3.3l and l3.4l we present the initial and final values of F of 
the twenty simulations for all the dimensions k, together with the original dimension 
d ~ 200. In accordance with the above rules, flocking takes place if the final value 
of F lies below the lower dashed line, does not take place if it lies above the upper 
dashed line, otherwise the situation is not conclusive. The results are summarized in 
Table O 

Experience gained with a large amount of numerical experiments shows the fol- 
lowing interesting fact: The flocking behavior of the Cucker-Smale system is very 
stable with respect to the Johnson-Lindenstrauss projections. Usually, the projected 
systems show the same flocking behavior as the original one, even if the dimension is 
reduced dramatically, for instance from d = 200 to fc = 10 (see Figs l3.2l and l3.4p . This 
stability can be roughly explained as follows: Since the flocking behavior depends 
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200 150 100 50 25 10 

dimension 



i f 



I I 



200 150 100 50 25 10 5 2 

dimension 



Fig. 3.4. Numerical results for /3 = 1.7; First row shows the evolution of r(t) of the system 
projected to dimension k = 100 (left) and fc = 10 (right) in the twenty realizations, compared to the 
original system (bold dashed line). Second row shows the initial values r(t = 0) and final values 
r(t = 30) in all the performed simulations. 



mainly on the initial values of F and A, which are statistical properties of the random 
distributions used for the generation of initial data, and since N is sufRciently large, 
the concentration of measure phenomenon takes place. Its effect is that the initial 
values of the fluctuations of the projected data are very close to the original ones, and 
thus the flocking behavior is (typically) the same. There is only a narrow interval of 
values of /? (in our case this interval is located around the value /? — 1.62), which is 
a borderline region between flocking and non-flocking, and the projections to lower 
dimensions spoil the flocking behavior, see Fig l3.3l Let us note that in our simulations 
we were only able to detect cases when flocking took place in the original system, but 
did not take place in some of the projected ones. Interestingly, we never observed the 
inverse situation, a fact which we are not able to explain satisfactorily. In fact, one 
can make other interesting observations, deserving further investigation. For instance. 
Figs. 13.2 1 and 13. 31 show that if the original system exhibits flocking, then the curves of 
r(t) of the projected systems tend to lie above the curve of T{t) of the original one. 
The situation is reversed if the original system does not flock, see Fig. 13.41 

From a practical point of view, we can make the following conclusion: To obtain an 
indication about the flocking behavior of a highly dimensional Cucker-Smale system, 
it is typically satisfactory to perform a limited number of simulations of the system 
projected into a much lower dimension, and evaluate the statistics of their flocking 
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Table 3.1 



Statistics of the flocking behaviors of the systems in the original dimension d = 200 and in 
the projected dimensions. With /3 = 1.5 and fi = 1.62, the original system (d = 200) exhibited 
flocking behavior. With /3 = 1.5, even after random projections into 25 dimensions, the system 
exhibited flocking in all 20 repetitions of the experiment, and still in 14 cases in dimension 10. With 
/3 = 1.62, the deterioration of the flocking behavior with decreasing dimension was much faster, 
and already in dimension 25 the situation was not conclusive. This is related to the fact that the 
value P = 1.62 was chosen to intentionally bring the system close to the borderline between flocking 
and non-flocking. Finally, with jS = 1.7, the original system did not flock, and, remarkably, all the 
projected systems (even to two dimensions) exhibit the same behavior. 



behavior. If the resuh is the same for the majority of simulations, one can conclude 
that the original system very likely has the same flocking behavior as well. 



Relative error of projection of the v-variabies 



Reiative error of reconstruction of the v-variabies 




1.2 - 
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Fig. 3.5. Numerical results showing the time evolution of the relative error of projection (left 
panel) and relative error of recovery via £i-minimization (right panel) of the v-variables. 



3.5.2. Numerical validation of the high and low dimensional approxi- 
mation properties. Finally, we show how the relative error of projection and re- 
covery evolves in time. We consider an initial datum € R^^*^ x M^^'* for 
the Cuckcr-Smalc system with N = d = 200 and randomly generated entries from 
the normal distribution. The parameter (3 ~ 0.4, therefore, the system will exhibit 
flocking. First we project the system into k = 20, 40, 60, 100, 140, 180 dimensions and 
calculate the relative error of the projection of the w-variables, given by 

j:Ijmu,\\% ) 
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Wc observe that the maximal relative error (for k = 20) is around 14%, which we 
consider as a very good result. Moreover, in all 9 cases, the error first increases, but 
after t ~ 22 it starts decreasing, which is a consequence of the flocking behavior and 
concentration of measure, see the graphics in Figurc l375l on the left. This clearly shows 
that the worst-case estimate of Theorem 12 . 2 1 with exponential growth in time is overly 
pessimistic. 

In our second experiment, we take a randomly generated initial condition with 
N = d = 200 and 80% of the entries set to zero. Then, we take L projections of the 
system into 20 dimensions, with L = 1, 2, 3, 5, 7, 9, and reconstruct the u-trajcctorics 
using £i-minimization, as described in Section [3.4.1l In the graphics in Figure |375] on 
the right, we plot the relative errors, given by 



where Vi are the recovered trajectories. Again, we observe that the errors grow much 
slower than exponentially, and after < ~ 15 they even tend to stay constant or slightly 
decrease. 

4. Mean-field limit and kinetic equations in high dimension. In the pre- 
vious sections we were concerned with tractable simulation of the dynamical systems 
of the type (|1.1[) when the dimension d of the parameter space is large. Another source 
of possible intractability in numerical simulations appears in the situation where the 
number of agents N is very large. In general, large N imposes even a much more se- 
vere limitation than large d, since the computational complexity of (jl.ip is 0{d x N'^). 
Therefore, in the next sections we consider the so-called mean-field limit of (jl.ip as 
— > oo, where the evolution of the system is described by time-dependent probability 
measures iJ.{t) on R'', representing the density distribution of agents, and satisfying 
mesoscopic partial differential equations of the type ()4.1|) . This strategy originated 
from the kinetic theory of gases, see [16] for classical references. We show how our 
projection method can be applied for dimensionality reduction of the corresponding 
kinetic equations and explain how the probability measures can be approximated by 
atomic measures. Using the concepts of delayed curse of dimension and measure 
quantization known from optimal integration problems in high dimension, we show 
that under the assumption that the measure concentrates along low-dimensional sub- 
spaces (and more generally along low-dimensional sets or manifolds), it can be ap- 
proximated by atomic measures with sub-exponential (with respect to d) number of 
atoms. Through such approximation, we shall show that we can approximate suitable 
random averages of the solution of the original partial differential equation in high 
dimension by tractable simulations of corresponding solutions of lower-dimensional 
kinetic equations. 

Another interesting approach to the problem of efficient numerical simulation of 
large group dynamics is the so-called "equation- free" approach, see e.g. [17] • Here, 
convenient coarse-grained variables that account for rapidly developing correlations 
during initial transients are chosen, in order to perform efficient computations of 
coarse-grained steady states and their bifurcation analysis. The big advantage of the 
equation-free approach is that the coarse-grained dynamics can be explored without 
the assumption of the continuum limit equation as we consider here. The premise of 
the method is that coarse-grained governing equations conceptually exist, but arc not 
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explicitly available in closed form. The main idea is that short bursts of appropri- 
ately initialized microscopic (fine-scale) simulations and the projection of the results 
onto coarse-grained variables result in time-steppers (mappings) for those variables 
(which is effectively the same as the discretization of the unavailable equations) . One 
then processes the results of the short simulations to estimate various coarse-grained 
quantities (such as time derivatives, action of Jacobians, residuals) to perform rele- 
vant coarse-grained level numerical computations, as if those quantities were obtained 
from coarse-grained governing equations. 

4.1. Formal derivation of mean-field equations. In this section we briefly 
explain how the mean-field limit description corresponding to can be derived. 

This is given, under suitable assumptions on the family of the governing functions 

~ {fii fij '■ h j ~ ■ ■ N}, by the general formula 

1^ + V • (H^[/i]M) - 0, (4.1) 

where 'Hj^lfi] is a field in R'^, determined by the sequence = {J^N)NeN- 

In order to provide an explicit example, we show how to formally derive the mean 
field limit of systems of the type 

Xi=V^, (4.2) 
N N 

- E f^fi1^^^^^v)v, + flTi'Dx)x, , (4.3) 

with 

fl^Vx) = JjY^uiWx, ~ x,\\,.) + ^—^u{\\x, x,\\,.) , 



( 1 

n;\Vx,Vv)^8Ah{\v,,\\\^--Y^g{ 



Note that for suitable choices of the functions h,g,u this formalism includes both the 
Cucker-Smale model (|2.12[) - (|2.13p and the self-propulsion and pairwise interaction 
model (|3.13p - (|3.14p . We define the empirical measure associated to the solutions 
Xi{t), v,{t) of g21)~g^ as 

1 ^ 

Taking a smooth, compactly supported test function ^ S C^{S?''^) and using (|4.2p - 
(|4.3p . one easily obtains by a standard formal calculation (see |14j ) 



(4.4) 

V.A{x,v)-vd^i^{t,x,v)+ / W„^{x,v)■H[^i^{t)]{x,v)A^l^{t,x,v), 



R2d 

with 

'H[y]{x,v) = h{\\v\\ia)v + I g(\\x-y\\ia){w-v)d^i{y,w)+ / u{\\x - y\\iA{y - x) d^l{y,w) . 
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Wc now assume weak convergence of a subsequence of (/i^(t))ArgN to a time-dependent 
measure /i(t) — fi(t, v) and boundedness of its first order moment, which indeed can 
be estabHshed rigorously for the Cucker-Smale and the self-propulsion and pairwise 
interaction systems (see [38], [50]). Then, passing to the limit — >■ c» in (|4.4p . one 
obtains in the strong formulation that fi is governed by 

Ofjj 

— {t,x,v) +v ■ 'Vxf^{t,x,v) + Vi, • {n[iJ.{t)]{x,v)ii{t,x,v)) = 0, 

which is an instance of the general prototype (j4.ip . 

Using the same formal arguments as described above, one can easily derive mean 
field limit equations corresponding to (jl.ip with different choices of the family J-. 

4.2. Monge-Kantorovich-Rubinstein distance and stability. In several 
relevant cases, and specifically for the Cuckcr-Smale and the self-propulsion and pair- 
wise interaction systems [T3], solutions of equations of the type (|4.ip are stable with 
respect to suitable distances. We consider the space 'Pi{R'^), consisting of all proba- 
bility measures on M'' with finite first moment. In 'Pi{M.'^) and for solutions of (|4.ip . a 
natural metric to work with is the so-called Monge-Kantorovich-Rubinstein distance 
(also called Wasserstein distance) [57], 



Wi{n,iy) ~ sup{|(/x- 1^,01 



^(x)d(pi - v){x) 



,eeLip(M^),Lip(0<l}. 



(4.5) 

We further denote Vd^'^) the space of compactly supported probability measures on 
W^. In particular, throughout the rest of this paper, we will assume that for any 
compactly supported measure valued weak solutions n{t),v{t) e C{[0,T],'Pc{R'^)) of 
(|4.ip we have the following stability inequality 

Wli^iit),lyit))<c{t)Wl{^lio),l^io)), te[o,T], (4.6) 

where C{t) is a positive increasing function of t with C(0) > 0, independent of the 
dimension d. We address the interested reader to [TS] Section 4] for a sample of general 
conditions on the vector field H{J^]{ij.) which guarantee stability (|4.6p for solutions of 
equations (|4.1[) . 

4.3. Dimensionality reduction of kinetic equations. Provided a high-dimensional 
measure valued solution to the equation 

|^+V.(H^[M]Ai)=0, ^l{0) = ^IoeVciR''), (4.7) 

we will study the question whether its solution can be approximated by suitable 
projections in lower dimension. 

Given a probability measure fi G 7^i(M'*), its projection into M.'' by means of a 
matrix M : — > M'^ is given by the push-forward measure := M^fi, 

{fiM,(p) f{M-)) for all ip e Lip(M*-'). (4.8) 

Let us mention two explicit and relevant examples: 

• If /i^ — jj X)i3=i is an atomic measure, we have (/^j^, f) = {iJ,^ , tf{M-)) = 



N 

TfY^=iV{Mx,). Therefore, 

N 



^M = ^E'^^^-- (4-9) 

1=1 
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If fi is absolutely continuous with respect to the Lebesgue measure, i.e., it is a 
function in L^(&.'^), the calculation requires a bit more effort: Let us consider 
the pseudo-inverse matrix of M. Recall that = M*{MM*)-'^ is a 
right inverse of M, and M^M is the orthogonal projection onto the range of 
M* . Moreover, x = APMx + £,x, where £,x & kerM for aU x G R"*. According 
to these observations, we write 

f{Mx)fi{x)dx^ [ ip{Mx)fi{M^ Mx + £_x)dx 

ip{Mx)fi{MHlx + £,x)dx 

ranA/*0kor M 

)dv^dv 



ranA/* J kcr M 



Note now that M|,.anAf* ■ ranM* — >■ ranM ^ R is an isomorphism, hence y = 
Mv implies the change of variables dv = dei{M\^^^M')^^dy = det{MM*)~^^^dy. 
Consequently, we have 



(p{Mx)n{x)dx = I ip{Mx)fi{M^ Mx + ^x)dx 

/ ip{Mv)n{M''Mv + v-^)dv-^dv 

lA/* JkcrA/ 

(dS(]v7W^L„''<*'''' + ''')*")''''')*' 



and 



^-^(^) = det(MM-)V^ X„./-("^^^ + "")^-" 

According to the notion of push-forward, we can consider the measure valued function 
ly G C([0,T],7'c(R'')), solution of the equation 

+ V • {n^,, Miy) = 0, 1^(0) = Mm e VciR"), (4.10) 

ot 

where {^io)M = A/#/-io and Tm = {{Mfi, fij,i,j = l,...,N})Nm- As for the 
dynamical system (|3.5p , also equation (|4.10p is fully defined on the lower-dimensional 
space R'^ and depends on the original high-dimensional problem exclusively by means 
of the initial condition. 

The natural question at this point is whether the solution i/ of (|4.10p provides 
information about the solution /i of (|4.7p . In particular, similarly to the result of 
Theorem 13. 11 we will examine whether the approximation 

iy{t) ^ ^iM{t), te[0,T], 

in Monge-Kantorovich-Rubinstein distance is preserved in finite time. We depict the 
expected result by the following diagram: 

m Kt) 

IM i M 

Z^(0) ^ ino)M v{t) W ^lM{t) . 
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This question will be addressed by approximation of the problem by atomic measures 
and by an application of Theorem 13.11 for the corresponding dynamical system, as 
concisely described by 

W'i(M.M")<e AT 

n — > M 

^iM — > V ~ Mm 

Let us now recall the framework and general assumptions for this analysis to be 
performed. We assume again that for all € N the family = {fi,fij : i,j = 
1, . . .N} is composed of functions satisfying (|1.2p - (|1.4p . Moreover, we assume that 
associated to = (J^jv)AreN and to 

N 

±,{t) MVx{t)) + Y,.Uji'Dx{t))x,{t), (4.11) 
we can define a mean-field equation 

|^ + V-(H[J-](m)m) = 0, ^,{0) = ^i,eVc{R'), (4.12) 

such that for any compactly supported measure valued weak solutions fi{t),i>{t) € 
C{[0,T],'Pc{R'^)) of dill) we have the following stability 

Wli^lit),lyit))<cit)Wli^lio),l^io)), te[o,T], (4.13) 

where C(t) is a positive increasing function of t, independent of the dimension d. 
We further require that corresponding assumptions, including stability, hold for the 
projected system ()2.5p and kinetic equation (|4.10p . Then we have the following ap- 
proximation result: 

Theorem 4.1. Let us assume that fio G Vd^'^) and there exist points {ccj, . . . , x^} C 
, for which the atomic measure fiQ = X^i^i ^x" approximates fiQ up to e > in 
Monge-Kantorovich-Rubinstein distance, in the following sense 

Wi{fio,f^o) <s, N ^M^'^^^ fork{e)<d andk{e)^dfore^Q. (4.14) 

Requirement (|4.14p is in fact called the delayed curse of dimension as explained below 
in detail in Section \4-5\ Depending on e > we fix also 



k = k{e) = ©(£-2 log(A^)) = ©(£-2 \og{M)k{e)). 

Moreover, let M : — > M.^ be a linear mapping which is a continuous Johnson- 
Lindenstrauss embedding as in (|3.4p for continuous in time trajectories Xi (t) of (|4.1ip 
with initial datum Xi{0) = x^ . Let v € C([0, T], 7'c(M'^)) be the weak solution of 

■{H[Fm]{^)^)^Q, (4.15) 

v{Q) = {po)m e VaiR''), (4.16) 
where {p-o)m = ^-^#Mo- Then 

WiifiMit),i^{t))<Cit)\\M\\e, t€[0,T], (4.17) 
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where C{t) is an increasing function oft, with C(0) > 0, which is at most polynomially 
growing with the dimension d. 

Proof. Let us define {t) the solution to equation (j4.15p with initial datum 
1/^(0) = {ij,q)m, or, equivalently, thanks to (|4.9p 



where yi(t) is the solution of 



1 = 1 



TV 

y, = f,{V'y) + J2 h {'D'y)Vi > ^ = 1, . . . , iV , 
y,(0) = Mx°, z = l,...,7V. 

We estimate 

w^{^iM{t),v{t))<w,{^lM{t),{^i''{t))M) + w^{{^l^{t))M.l^''{t)) + w^^^^ 

By using the definition of push- forward (|4.8p and (|4.14p . the first term can be esti- 
mated by 

w^iiJLMit), {^J^''{t))M) = sup{(MAfW - {^Ji''{t))M.v) ■■ Lip(^) < 1} 

= sup{(^(t) - ^"^(0,^(1/.)) : LiplV') < 1} 
<\\M\\W,{^i{tl^,''{t))<\\M\\C{t)e. 

We estimate now the second term 

M^i((/x^W)m,;^'^(<)) = sup{((M^(t))M - v''{t),v) : Lip(^) < 1} 

1 ^ 

= "'^P^^ W) - '/'(y. W)) : Lip(<^) < 1} 



1=1 



i=l 

We recall the uniform approximation of Theorem 13. H 

\\Mx,{t) - y,{t)\\,. <D{t)e, i = l,...,N, 

where D{t) is the time-dependent function on the right- hand-side of p.7p . Hence 

Wi{fiM{t)At^^it))M)<D{t)e. 

We address now the upper estimate of the third term, by the assumed stability of the 
lower dimensional equation (j4.10p 

Wi{,^^it),iy{t)) < Cit)Wi{iy''iO),iym 

= C{t)Wi{{fl^)M,{flo)M) 

<C{t)\\M\\W{fi^,fio)<CmM\\e. 

We can fix C{t) ~ 2C(t)||Af || + D{t), and, as observed in Theorem 13.31 we can assume 
without loss of generality that < Hence, C{t) depends at most polynomially 
with respect to the dimension d. □ 
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4.4. Approximation of probability measures by atomic measures and 
optimal integration. In view of the fundamental requirement (|4.14p in Theorem 
14.11 given hq G 'Pc(R'^), we arc interested to estabhsh an upper bound to the best pos- 
sible approximation in Monge-Kantorovich-Rubinstein distance by means of atomic 



measures 



N 



N 



12iLo^ (5^.0 with atoms, i.e. 



inf 

N 1 s:^N — 1 X 



(4.18) 



= inf sup{| / a^)d^,oix) - ^ E ^ ^ Lip(K'):Lip(C) < l} 



In fact, once we identify the optimal points {a 



0' ■ 



'-A'-i}' '^^^ them as initial 



conditions Xi{0) 
relationship (|4.6 



for the dynamical system (|4.1ip . and by using the stability 



we obtain 



Wl{^l{t),^I'^t))<ciT)Wlipo,^^o), te[o,T], 



(4.19) 



where fJ^'^ {t) = jf'^iLo^^xi{t)j meaning that the solution of the partial differential 
equation (|4.ip keeps optimally close to the particle solution of (|4.1ip also for suc- 
cessive time t > 0. Note that estimating (|4.18p as a function of is in fact a very 
classical problem in numerical analysis well-known as optimal integration with its 
high-dimensional behaviour being a relevant subject of the field of Information Based 
Complexity [491155]. 

The numerical integration of Lipschitz functions with respect to the Lebesgue 
measure and the study of its high-dimensional behaviour goes back to Bakhvalov [2], 
but much more is known nowadays. We refer to |33| and |36| for the state of the art 
of quantization of probability distributions. 

The scope of this section is to recall some facets of these estimates and to refor- 
mulate them in terms of Wi and £n- We emphasize that here and in what follows, 
we consider generic compactly supported probability measures ^, not necessarily ab- 
solutely continuous with respect to the Lebesgue measure. We start first by assuming 
d = 1, i.e., we work with a univariate measure /i € 'Pc(lR) with support supp fi C [a, b] 
and a := b — a > 0. We define the points xq, . . . , xn-i as the quantiles of the proba- 
bility measure i.e., xq :— a and 



N 



dtJ.{x), 



,N-l. 



(4.20) 



This is notationally complemented by putting xn 
j^^*^ dfi{x) = j^,i = 0, . . . , N — 1, and we have 



b. Note that by definition 



^{x)d^i{x) - j^Yl ^(^') 



< 



Af-l 

E 

i=0 
Af-1 

1=0 



ia^) - ax^))d^Ji{x) 



\^{x)~^{x,)\d^l{x) 



Lip(^) 

N 



N-l 



{xi+1 - Xt) = 



gLip(0 
N ' 



(4.21) 
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Hence it is immediate to see that 

£N{fi)= inf w^{^l,^l^)<^. 

We would like to extend this estimate to higher dimension d > 1. However, for 
multivariate measures there is no such an easy upper bound, see [33] and [35] for 
very general statements, and for the sake of simplicity we restrict here the class of 
measures n to certain special cases. As a typical situation, we address tensor product 
measures and sums of tensor products. 

Lemma 4.2. Let ^i,...,/ e ViiR) with < £j, j ^ ■ ■ ■ ,d for 

some iVi, . . . , iVd e N, ei, . . . , Ed > and /x^-^^ := ^ E^=o'' ■ Let N = Jlti 
Then 

d 

Wl{fl^®■■■®^i'^,^l^)<J2^J' 

where 

1 . . 

:= — ^ (5^ and X := J|{a;^, . . . , a;]^_ J. 

xex j=i 



Proof. The proof is based on a simple argument using a telescopic sum. For 
J = 1 , . . . , d 4- 1 we put 



vr--- 



1 ^ " /■ 



Of course, if j = 1, then the integration over W ^ is missing and if j = d + 1 then 
the summation becomes empty. Now 



„ -, Ni-l a 

/ mdKx) - -r— E ■ ■ ■ E ^(4 = E(^.-+i - 



nti .t^o 



together with the estimate |Vj+i — V, | < finishes the proof. □ 

Lemma l4.2l savs. roughly speaking, that the tensor products of sampling points of 

univariate measures are good sampling points for the tensor product of the univariate 

measures. Next lemma deals with sums of measures. 

Lemma 4.3. Let ^i, . . . G Vi{R'') with Wi{ni,fij^) < ei, I = 1, ... ,L /or 

some A G N, £i, . . . , £l > and := X^ilo^ ^a^i ; • Then 



L 

L 
1=1 

where 

L L 



7^ E '^^ = tE/^'^ a^icf A |J{a;i,o,---,a;i,Ar-i}. 



^ ■ LN^^^ L 

xex 1=1 1=1 
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Proof. We use the homogeneity of the Monge-Kantorovich-Rubinstein distance 
Wi{afi, av) = aWi{(i, v) for n^v ^ 7'i(R'') and a > combined with its subadditivity 
Wi{iJ,i+ IJ2, 1^1 + ^2) < Wi{ni,vi) + Wi{ii2,V2) for /^i,/^2,i^i,J^2 e ViiW^). We obtain 

;=i 1=1 

□ 

Next coroUary follows directly from Lemma 14.21 and Lemma 14.31 
Corollary 4.4. (i) Let n^, . . . , e 7'i(M) and Ni,...,NdeN. Then 

d 

EN{^i^ ®■■■®^Ji'^)<^£N,{^i^), where N -.^ Ni ■ ■ ■ Nd. 
(a) Let fii,...,fiL€ ViiR'^) and N e N. Then 

1=1 

4.5. Delayed curse of dimension. Although Lemma 14.21 Lemma 14.31 and 
Corollary 14. 41 give some estimates of the Monge-Kantorovich-Rubinstein distance be- 
tween general and atomic measures, the number of atoms needed may still be too 
large to allow the assumption (|4.14p in Theorem 14.11 to be fulfilled. Let us for exam- 
ple consider the case, where jj} = ■ ■ ■ = ^'^ hi Lemma [4.21 and £1 = • • • = =: e. 
Then, of course, A'^i = • • • = Nd =: Af and we observe, that the construction given in 
Lemma 14.21 gives an atomic measure, which approximates // up to the error de using 
N''^ atoms, hence with an exponential dependence on the dimension d. This effect is 
another instance of the well-known phenomenon of the curse of dimension. 

However, in many real-life high-dimensional applications the objects of study 
(in our case the measure fj, S Vc{R'^)) concentrate along low-dimensional subspaces 
(or, more general, along low-dimensional manifolds) [Sj El [El [20l [21]. The number 
of atoms necessary to approximate these measures behaves in a much better way, 
allowing the application of (|4.14p and Theorem 14.11 To clarify this effect, let us 
consider ij = /^"'^ • • • (g) /i'' with supp/x^ C [oj, bj] and define aj = bj — aj. Let us 
assume, that cri > (72 5^ ■ ■ ■ 5^ f d > is a rapidly decreasing sequence. Furthermore, 
let e > 0. Then we define k := fc(e) to be the smallest natural number, such that 

d 

J2 '^k< e/2 

k=k{s) + l 

and put Nk = 1 for k E {k{e) + 1, . . . , d}. The numbers iVi = • • • = ^^(^j = A/" are 
chosen large enough so that 

77 E ^ 
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Then Lemma 1321 together with (|4.20p state that there is an atomic measure with 

N = TV*'^^) atoms, such that 

d 

W,{^Ji,^l'')<Y^^<e/2 + e/2. (4.22) 

fe=i ^ 

Hence, at the cost of assuming that the tensor product measure /i is concentrated 
along a fc(e)-dimensional coordinate subspace, we can always approximate the mea- 
sure jjL with accuracy e by using an atomic measure supported on points whose number 
depends exponentially on = fc(e) <C d. However, if we liked to have e — >■ 0, then 
k{e) ^ d and again we are falling under the curse of dimension. This delayed kicking 
in of the need of a large number of points for obtaining high accuracy in the ap- 
proximation (j4.22p is in fact the so-called delayed curse of dimension^ expressed by 
assumption (|4.14p . a concept introduced first by Curbera in [24], in the context of 
optimal integration with respect to Gaussian measures in high dimension. 

Let us only remark, that the discussion above may be easily extended (with help 
of Lemma [4. Bp to sums of tensor product measures. In that case we obtain as atoms 
the so-called sparse grids, cf. [10]. Using suitable change of variables, one could also 
consider measures concentrated around (smooth) low-dimensional manifolds, but this 
goes beyond the scope of this work, see [33] for a broader discussion. 
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Abstract. In this paper we explore how concepts of high-dimonsional data compression via 
random projections onto lower-dimensional spaces can be applied for tractable simulation of cer- 
tain dynamical systems modeling complex interactions. In such systems, one has to deal with a 
large number of agents (typically millions) in spaces of parameters describing each agent of high 
dimension (thousands or more). Even with today's powerful computers, numerical simulations of 
such systems are prohibitively expensive. We propose an approach for the simulation of dynami- 
cal systems governed by functions of adjacency matrices in high dimension, by random projections 
via Johnson-Lindenstrauss embeddings, and recovery by compressed sensing techniques. We show 
how these concepts can be generalized to work for associated kinetic equations, by addressing the 
phenomenon of the delayed curse of dimension, known in information-based complexity for optimal 
numerical integration problems and measure quantization in high dimensions. 

Key words. Dimensionality reduction, dynamical systems, flocking and swarming, Johnson- 
Lindenstrauss embedding, compressed sensing, high-dimensional kinetic equations, delayed curse of 
dimension, optimal integration of measures in high dimension. 

AMS subject classifications. 34C29, 35B35, 35Q91, 35Q94, 60B20, 65Y20. 

1. Introduction. The dimensionality scale of problems arising in our modern 
information society has become very large and finding appropriate methods for dealing 
with them is one of the great challenges of today's numerical simulation. The most 
notable recent advances in data analysis are based on the observation that in many 
situations, even for very complex phenomena, the intrinsic dimensionality of the data 
is significantly lower than the ambient dimension. Remarkable progresses have been 
made in data compression, processing, and acquisition. We mention, for instance, 
the use of diffusion maps for data clouds and graphs in high dimension [5j [6l [191 HOl 
dU 133] in order to define low-dimensional local representations of data with small 
distance distortion, and meaningful automatic clustering properties. In this setting 
the embedding of data is performed by a highly nonlinear procedure, obtained by 
computing the eigenfunctions of suitable normalized diffusion kernels, measuring the 
probability of transition from one data point to another over the graph. 

Quasi-isometrical linear embeddings of high-dimensional point clouds into low- 
dimensional spaces of parameters are provided by the well-known Johnson-Lindenstrauss 
Lemma [I] [521 SO] '■ any cloud of N points in can be embedded by a random lin- 
ear projection M nearly isometrically into M'^ with k = C(e~^ log(A^)) (a precise 
statement will be given below). This embedding strategy is simpler than the use of 
diffusion maps, as it is linear, however it is "blind" to the specific geometry and local 
dimensionality of the data, as the embedding dimension k depends exclusively on the 
number of points in the cloud. In many applications, this is sufficient, as the number 
of points M is supposed to be a power of the dimension d, and the embedding produces 
an effective reduction to fc = 0{e~^ \og(N)) = 0{e~^ log(d)) dimensions. As clarified 
in [21 [31], the Johnson-Lindenstrauss Lemma is also at the basis of the possibility 
of performing optimal compressed and nonadaptive acquisition of high-dimensional 
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data. In compressed sensing [H [13 a vector a; G M'* is encoded in a vector y e M*^ by 
applying a random projection M, which is modehng a hnear acquisition device with 
random sensors, i.e., y = Mx. From y it is possible to decode x approximately (see 
Theorem 13 . 71 below) by solving the convex optimization problem 



x^ = arg mm 

Mz=y 




with the error distortion 



where aK{x)id ~ 'v[^iz:#sui:>p(z)<K \\z ~ ^Wif and K ~ 0{k/{\og{d/k) + 1)). We de- 
note = {z g R'' : #supp (z) < K} the set of A'-sparse vectors, i.e., the union of 
iC-dimensional coordinate subspaces in W^. In particular, if a; € E/^ , then x"^ = x. 
Hence, not only is M a Johnson-Lindenstrauss embedding, quasi-isometrical on point 
clouds and i^-dimensional coordinate subspaces, but also allows for the recovery of 
the most relevant components of high-dimensional vectors, from low-dimensional en- 
coded information. A recent work [H [59] extends the quasi-isometrical properties of 
the Johnson-Lindenstrauss embedding from point clouds and .ftT-dimcnsional coordi- 
nate subspaces to smooth compact Riemannian manifolds with bounded curvature. 
Inspired by this work, in [39] the authors extend the principles of compressed sensing 
in terms of point recovery on smooth compact Riemannian manifolds. 

Besides these relevant results in compressing and coding-decoding high-dimensional 
"stationary" data, dimensionality reduction of complex dynamical systems and high- 
dimensional partial differential equations is a subject of recent intensive research. 
Several tools have been employed, for instance, the use of diffusion maps for dynam- 
ical systems [48], tensor product bases and sparse grids for the numerical solution of 
linear high-dimensional PDEs [261 UHl [M] [35] , the reduced basis method for solving 
high-dimensional parametric PDEs [7] IQ] l46 l [53 ] [54 ] [56 ] . 

In this paper we shall further explore the connection between data compression and 
tractable numerical simulation of dynamical systems. Eventually we address the so- 
lutions of associated high-dimensional kinetic equations. We are specially interested 
in dynamical systems of the type 

N 

x,{t) - h{Vx{t)) + f,,{Vx{t))x,{t), (1.1) 
i=i 

where we use the following notation: 
9 N E N - number of agents, 

• x{t) = {xi{t),...,XN{t)) G W^""^, where x^ : [0,T] R'', i = 1, . . . , N , 

pNxN 



./.jiM^x^^R, i,j^l,...,N, 

• V : R'^^^ -> R^^^, Vx := {\\xi - Xj\\gd)fj^^ is the adjacency matrix of the 
point cloud x. 

We shall assume that the governing functions fi and are Lipschitz, but we shall 
specify the details later on. The system (|l.ip describes the dynamics of multiple com- 
plex agents x{t) = {xi{t), . . . ,XN{t)) E M''^^, interacting on the basis of their mutual 
"social" distance 'Dx{t), and its general form includes several models for swarming and 
collective motion of animals and micro-organisms, aggregation of cells, etc. Several 
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relevant effects can be included in the model by means of the functions fi and , 
in particular, fundamental binary mechanisms of attraction, repulsion, aggregation, 
self-drive, and alignment [T31 [HI [231 EOl HlI ■ Moreover, possibly adding stochastic 
terms of random noise may also allow to consider diffusion effects [U [14]. How- 
ever, these models and motion mechanisms are mostly derived borrowing a leaf from 
physics, by assuming the agents (animals, micro-organisms, cells etc.) as pointlike 
and exclusively determined by their spatial position and velocity in M'' for cZ = 3 -I- 3. 
In case we wished to extend such models of social interaction to more "sophisticated" 
agents, described by many parameters (d ^ 3 -I- 3), the simulation may become com- 
putationally prohibitive. Our motivation for considering high-dimensional situations 
stems from the modern development of communication technology and Internet, for 
which we witness the development of larger and larger communities accessing infor- 
mation (interactive databases), services (financial market), social interactions (social 
networks) etc. For instance, we might be interested to simulate the behavior of cer- 
tain subsets of the financial market where the agents are many investors, who are 
characterized by their portfolios of several hundreds of investments. The behavior of 
each individual investor depends on the dynamics of others according to a suitable 
social distance determined by similar investments. Being able to produce meaningful 
simulations and learning processes of such complex dynamics is an issue, which might 
be challenged by using suitable compression/dimensionality reduction techniques. 
The idea we develop in this paper is to randomly project the system and its initial 
condition by Johnson-Lindenstrauss embeddings to a lower-dimensional space where 
an independent simulation can be performed with significantly reduced complexity. 
We shall show that the use of multiple projections and parallel computations allows 
for an approximate reconstruction of the high-dimensional dynamics, by means of 
compressed sensing techniques. After we explore the tractable simulation of the dy- 
namical systems (jl.ip when the dimension d of the parameter space is large, we also 
address the issue of whether we can perform tractable simulations when the number 
N of agents is getting very large. Unlike the control of a finite number of agents, 
the numerical simulation of a rather large population of interacting agents {N ^ 0) 
can constitute a serious difficulty which stems from the accurate solution of a pos- 
sibly very large system of ODEs. Borrowing the strategy from the kinetic theory of 
gases [16], we want instead to consider a density distribution of agents, depending on 
their d-parameters, which interact with stochastic infiuence (corresponding to classi- 
cal collisional rules in kinetic theory of gases) - in this case the infiuence is "smeared" 
since two individuals may interact also when they are far apart in terms of their 
"social distance" Vx. Hence, instead of simulating the behavior of each individual 
agent, we shall describe the collective behavior encoded by a density distribution 
whose evolution is governed by one sole mesoscopic partial differential equation. We 
shall show that, under realistic assumptions on the concentration of the measure /x 
on sets of lower dimension, we can also acquire information on the properties of the 
high-dimensional measure solution ^ of the corresponding kinetic equation, by con- 
sidering random projections to lower dimension. Such approximation properties are 
determined by means of the combination of optimal numerical integration principles 
for the high-dimensional measure |331 136] and the results previously achieved for 
particle dynamical systems. 
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1.1. Fundamental assumptions. We introduce the following notation for £p- 
norms of vectors v € R'', 



i/p 

for 1 < p < oo, 



and 



\\v\\fd := max wA. 
=^ i=i,...4 

For matrices x g K"><™ we consider the mixed norm 

Mi^ii-,) :=ll(ll^.|l^,")"ilUr' 

where Xi € R" is the i*''-column of the matrix x. 

For the rest of the paper we impose three fundamental assumptions about Lips- 
chitz and boundedness properties of fi and /y , 

\h{a)- h{b)\<L\\a~Hi^{i^). * = l,...,iV (1.2) 

N 



max V|/,,(a)|<L', (1.3) 

%—l,...,N ^ — ^ 
J = l 

JV 

. "la^.E - /'^^■(^)l ^ - ^lle(^Si)' (1-4) 

i—l,...,N ^ — ^ °^ °° 

J=l 

for every a,b £ R^'^^. Unfortunately, models of real-life phenomena would not 
always satisfy these conditions, for instance models of financial markets or socio- 
economic interactions can be expected to exhibit severely discontinuous behavior. 
However, these assumptions are reasonable in certain regimes and allow us to prove 
the concept we are going to convey in this paper, i.e., the possibility of simulating 
high-dimensional dynamics by multiple independent simulations in low dimension. 

1.2. Euler scheme, a classical result of stability and convergence, and 
its complexity. We shall consider the system of ordinary differential equations of 
the form (jl.ip with the initial condition 

x,iO) = x°, i = l,...,N. (1.5) 

The Euler method for this system is given by (|1.5p and 



N 

h{Vx-)+Y^Uj{Vx-)x^ 

3 = i 



,no-l. (1.6) 



where /i > is the time step and tiq T/h is the number of iterations. We con- 
sider here the explicit Euler scheme exclusively for the sake of simplicity, for more 
sophisticated integration methods might be used. 

The simulation of the dynamical system (|1.6p has a complexity which is at least 
the one of computing the adjacency matrix Vx^ at each discrete time t", i.e., 0{d x 
A^^). The scope of the next sections is to show that, up to an e-distortion, we can 
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approximate the dynamics of (jl.ip by projecting the system into lower dimension and 
by executing in parahel computations with reduced complexity. Computation of the 
adjacency matrix in the new dimension requires only 0{e~^ log(iV) x iV^) operations. 
Especially if the distortion parameter e > is not too small and the number of agents 
is of a polynomial order in d, we reduce the complexity of computing the adjacency 
matrix to C'(log(d) x iV^). 

2. Projecting the Euler method: dimensionality reduction of discrete 
dynamical systems. 

2.1. Johnson-Lindenstrauss embedding. Wc wish to project the dynamics 
of (jl.ip into a lower-dimensional space by employing a well-known result of Johnson 
and Lindenstrauss |40j . which we informally rephrase for our purposes as follows. 

Lemma 2.1 (Johnson and Lindenstrauss). Let V be an arbitrary set of M points 
in . Given a distortion parameter e > 0, there exists a constant 

fco = 0(e-2log(AA)), 
such that for all integers k > ko, there exists a k x d matrix M for which 

(1 - e)\\x - x\\% < \\Mx - Mi\\% < (1 + e)\\x - (2.1) 
for all x^ x ^ V . It is easy to sec that the condition 

(1 - e)\\p\\l, < ||Afp|l| < (1 + e)|b||2,, p e (2.2) 

implies 

(l-e)llpll,. < llAfpll,. < (l + e)||p||,., peM'*, (2.3) 

for < e < 1, which will be used in the following sections. On the other hand, (|2.3p 
implies (|2.2p with 3e instead of e. 

Our aim is to apply this lemma to dynamical systems. As the mapping M from 
Lemma [2.1l is linear and almost preserves distances between the points (up to the s > 
distortion as described above), we restrict ourselves to dynamical systems which are 
quasi-linear or whose non-linearity depends only on the mutual distances of the points 
involved, as in (|l.ip . 

Let us define the additional notation, which is going to be fixed throughout the 
paper: 

• d & N - dimension (large) , 

• e > - the distortion parameter from Lemma 12.11 

• k Cz N - new dimension (small), 

• M g R*^^"* - randomly generated matrix as described below. 

The only constructions of a matrix M as in Lemma 12.11 known up to now are 
stochastic, i.e., the matrix is randomly generated and has the quasi-isometry property 
(|2.ip with high probability. We refer the reader to [3S] and [TJ Theorem 1.1] for two 
typical versions of the Johnson-Lindenstrauss Lemma. 

We briefiy collect below some well-known instances of random matrices, which 
satisfy the statement of Lemma [2. II with high probability: 

• k X d matrices M whose entries ruij are independent realizations of Gaussian 
random variables 
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k X d matrices M whose entries are independent realizations of ± Bernoulli 
random variables 



mi, 7 := 



+ -^, with probability ^ 
— with probability i 



Several other random projections suitable for Johnson-Lindenstrauss embeddings 
can be constructed following Theorem 13.61 recalled below, and we refer the reader to 
[33] for more details. 

2.2. Uniform estimate for a general model. If M G R*^^'' is a matrix, we 
consider the projected Euler method in R*^ associated to the high-dimensional system 
(fTSjl - pTe)) . namely 

y° := Mxl (2.4) 

N 



■■=y? + h 



A//,(2?'y") + ^/,,(I?'2/")y," 



71 = 0,..., 710-1. (2.5) 



We denote here V : R''^'^ M^^^, 2?'?; := {\\y, - yj\\t,k)^j=i, the adjacency matrix 
of the agents y = (j/i, . . . , y^) in K.'^^'^. The first result of this paper reads as follows. 
Theorem 2.2. Let the sequences 

{a;", i = 1, . . . , N and 71 = 0, ... , 7io} and {y", i — 1, . . . , N and n — 0, . . . , uq} 

be defined by pT5)) - pr^ and ((^ -(^3 1) with and satisfying pT^ - pTi)) and a 

matrix M e M''^'* with 

||A//,(2?V) - Mh{Vx-)\\,, <(! + £) ||/.(P'j/") - h{Vx-)h. , (2.6) 
||Mx;'||,. <(l + £)||x;'|i,., (2.7) 
(1 - e)\\x\^ - ^^hi < P/xr - Mx]\\,. < (1 + e)\\x: - x]\\,. (2.8) 

for all i, j = 1, . . . , and all n ~ 0, . . . ,nQ. Moreover, let us assume that 

a > max ||a;"||£d for all ?i = 0, ...,no, j — l,...,N. (2.9) 

j J 2 

Let 

el -Wy^ - Mx'IWi,,, t = l,...,N andn = 0,...,no (2.10) 



and set := max-; e". Then 



<ehnBcxpihnA), (2.11) 

where A:=L' + 2(1 + £){L + aL") and B := 2a(l + e)(L + ai"). 

We remark that conditions p.6p - p.8p are in fact satisfied as soon as M is a 
suitable Johnson-Lindenstrauss embedding as in Lemma l^TTl for the choice J\f — 2NnQ 
andfc = C'(e-2log(AA)). 
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Proof. Using ((^1^ and (HHJ-ldll) and combined with (US]) and 

we obtain 



N 



^/,,(I?'y")2/;-/,,(I?a;")Ma 



<er + Ml + £)||/.p'y")"/«(2?2;")ll,. 
<e- + h{l + e)mV'y^)-MVxnh. 

N 

+ hY,[\f,,{V'y")\e] + (1 + s)\\x]\\,. ■ IMV'yn ~ 
i=i 

Taking the maximum on both sides, this becomes 

We use (|1.2p - (|1.4p for a = V'y" and 5 = 2?x" to estimate all the terms on the 
right-hand side. This gives 

^n+i < ^ ^ e)L\\V'y'' - + h£''L' + h{l + e)aL"\\V'y" - 

< £"{1 + hL') + h{l + e){L + aL") [||2?'y" - 2?'A/a;"|lf« ) + ||I?'Afx" - (^«)] 

< £"(1 + /iL') + 2h{\ + £)(L + " + ae), 

where we used (|2.8p in the last line. This, together with £^ = 0, leads to 

£" < ehnBexp{hnA), 

where A -.^ L' + 2(1 + e){L + aL") and B 2a(l + £)(L + aL"). □ 

2.3. Uniform estimate for the Cucker-Smale model. As a relevant exam- 
ple, let us now show that Thcorcm l2.2l can be applied to the well-known Cucker-Smale 
model, introduced and analyzed in [22j [23] , which is described by 



1 ^ 



g{\\x,-Xj\\t,d){vj-Vi), i = l,. 



,N. 



(2.12) 
(2.13) 



The function g : [0, od) — > R is given by g{s) ~ -fj:^^2yn fo^' > 0; and bounded by 
g{0) = G > 0. This model describes the emerging of consensus in a group of interacting 
agents, trying to align (also in terms of abstract consensus) with their neighbors. One 
of the motivations of the model from Cucker and Smale was to describe the formation 
and evolution of languages [531 Section 6] , although, due to its simplicity, it has been 
eventually related mainly to the description of the emergence of flocking in groups of 
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birds 122] • In the latter case, in fact, spatial and velocity coordinates are sufficient to 
describe a pointlike agent (d = 3 + 3), while for the evolution of languages, one would 
have to take into account a much broader dictionary of parameters, hence a higher 
dimension d ^ 3 + 3 of parameters, which is in fact the case of our interest in the 
present paper. 

Let us show that the model is indeed of the type We interprctc the system 

as a group of 27V agents in R'', whose dynamics is given by the following equations 

N 
N 

with f^^iVx) := E-9(ll^^ " ^^H^?)' /^(^^) ^= J^aiW^^ " ^^H^^)' 

fe=i 

for i ^ j. The condition (|1.2p is empty, (|1.3p reads 

L' > max(l,2G) > max 1 1, | - x^W,.)^ . 

Finally, 

2 ^ I 

911 II ^ 

< max^lp . Y,\\\x- - x]\\,. Wy- - y-\\,. 

<2||.g||Lip-||W-I?a:;"|l,«(,«) 

shows that L" < 2||g||Lip- The boundedncss of the trajectories in the phase-space of 
(|2.12p - (|2.13p at finite time has been proved, for instance, in [37], see also [iHl Theorem 
4.6]. The boundedncss at finite time is clearly sufficient to define the constant a 
appearing in Theorem 12. 2[ also because we are mainly interested in the dynamics 
for short time, due to the error propagation. Of course the constant a might grow 
with time, but, for instance, for the Cucker-Smale system it grows at most linearly 
in time [14j : as in the error estimate (|2.1ip we have an exponential function in time 
appearing, the possible linear growth can be considered a negligible issue; moreover, 
as our numerical experiments show, see Section 13. 5[ the situation is much better 
in practice, and suitable scaling, as indicated below, allows us to assume in several 
circumstances that the constant a is uniformly bounded for all times. In fact, even 
when we were interested in longer time or even asymptotical behavior, especially when 
pattern formation is expected, then we would observe the following additional facts: 
In the Cuckcr-Smalc model the center of mass and the mean velocity are invariants 
of the dynamics. Moreover the rate of communication between particles is given by 
9i^) ~ {i+s^)f • When /? < 1/2 it is know (see [14]) that the dynamics will converge 
to a flocking configuration. In this case one can translate at the very beginning the 
center of mass and the mean velocity to 0, and the system will keep bounded for all 
times. Hence in this case the constant a can also be considered uniform for all times 
(not only bounded at finite time). 
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2.4. Least-squares estimate of the error for the Cucker-Smale model. 

The formula (|2.11[) provides the estimate of the maximum of the individual errors, i.e., 
£" := 11(2/" — In this section we address the stronger £2 (^2)"6stimate 
for the error. For generic dynamical systems (jl.ip such estimate is not available in 
general, and one has to perform a case-by-case analysis. As a typical example of 
how to proceed, we restrict ourselves to the Cucker-Smale model, just recalled in the 
previous section. The forward Euler discretization of (|2.12p - (|2.13p is given by 



,,n + l 



(2.14) 



with initial data and given. Let M be again a suitable random matrix in the 
sense of Lemma l2.1l The Euler method of the projected system is given by the initial 
conditions = Mx'^ and = A/?j,° and the formulas 



(2.15) 



N 



^E-9(iiyr-y"ii.,0K-o- 



We are interested in the estimates of the following quantities 



Mv]' 



CI: 



N 



\ 



N 



N 



|«-M<)f^ill,«(,. 



TV 



(2.16) 



(2.17) 



Theorem 2.3. Let the sequences {a;f }, {wf}, {yf}, {e",i} and {e" J, i = 

1,...,N and n = l,...,no be given by (plU) . (p7I^ . ((2ll)) and 0J7]\ . respectively. 
Let e > and let us assume, that the matrix M satisfies 

(1 ~ e)|j< - x]h. < \\Mx: ~ Mx]\\,. < (1 + e)\\x^ - x]\\,. and 

(1 - e)\K - v^Wii < - ^'Hh ^ (1 + - ""ll^f 

for all i, j ~ 1, . . . , N and n = 0, . . . , riQ. 

Then the error quantities f " and £y introduced in (j2.16p and (|2.17p satisfy 



^{£^f + (£„")2 < £(1 + e)hn\\g\\upVXcMhn\\A\\), 



where V := max- 



A 



1 

2(l+e)||5llLip^ 2G^ 



Proof Using (|2.14p and p.lSp . we obtain 

el't' < e" , + he: , and C*"' < C + /^C- 



(2.18) 
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To bound the quantity wc have to work more. We add and subtract the term 
giWyf - V'jWu^iMv]^ - Mvf) and apply (pji)) and I^JQ. This leads to 

h ^ 

-5(||xr-x7||,.)(Aft;;'-A/<)||,. 
h ^ 

^ + 1^ E-9(iiyr - yl\k)i<,3 + <^) (2-19) 



N 



i=i 

We estimate the first summand in (|2.19p 

, N ] C ^ hC ^ 

^ Effdl^^r - y;il.,0(e", + <.) < ^ [iVe^:,, + ^ el^] ^ hGel, + ^ E 
and its ^2-norm with respect to i by Holder's inequality 

n' 



/iV^GC + ^(E(E<.) 1 <2/ix/iVG£;. (2.20) 



To estimate the second summand in p.lOp we make use of 

|ll^r-^"ll.f-||yr-y"ll.,^| 

< |ll< - 411^;; - - Mx1\U\ + IllAf^r - Mx"\U - ll^r - y"\\ 



We arrive at 

N 

llffllLip 

TV 



^^^'^^"^""'^ E II"" - - x-\\,i + + el,) 

< (^+--yi^-p^ (.f li.r - 411..^ + + E <. 



j=i i=i 



The €2-norm of this expression with respect to i is bounded by 



(E(Eii-"--"ii^.^) ) +^(E««:)') +^E4'. 

I z^i j;'— 1 i— 1 1 



N I 

< (1 + £)/i||5||LipV^ViV(eX + 2£^). (2.21) 
Combining (|2.19p with (|2.20p and (|2.2ip leads to the recursive estimate 

< s: + hs:, (2.22) 

< C + + h{l + e)||.g||Lip'^^ {eX + 28^} , 
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which we put into the matrix form 



Sh') - (I) + ((1 + e)eh\\g\\^,,V^ ' ^'^^^'^ 



where is a 2 x 2 matrix given by 
A! ^Id + hA : 
Taking the norms on both sides of (j2.23p leads to 



l) +''(2(l + £)||g||Lipy 2G 



+ < (1 + h\\A\\W{E^Y + {8^Y+e{\ + e)h\\gU,^VX, 

which gives the least-squares error estimate (|2.18p . □ 

3. Dimensionality reduction for continuous dynamical systems. 

3.1. Uniform estimates for continuous dynamical systems. In this section 
we shall establish the analogue of the above results for the continuous time setting of 
dynamical systems of the type (jl.ip . 

N 

= /,(2?x)+^/,;,(27x)x,, i = l,...,N, (3.1) 

a;i(0)=x°, i = l,...,N. (3.2) 

We adopt again the assumptions about Lipschitz continuity and boundedncss of the 
right-hand side made in Section [2j namely (|1.2p . (|1.3p . and (|1.4p . 

Theorem 3.1. Let x(t) G K'^^^, t e [0,r], be the solution of the system ([XT]) - 
p.2p with fi 's and fij 's satisfying (|1.2p - (jl.4p . such that 

max max\\xi(t) — Xj(t)\\pd <a. (3-3) 
te[o,T] i,j 2 

Let us fix k G N, k < d, and a matrix M G R*''^'' such that 

(1 - e)\\x4t) - x,{t)\\,. < \\Mx,it) - Mx,it)\\,. < (1 + e)\\x,it) - x,it)\\,. ,(3.4) 

for all t e [0, T] and i, j = I,..., N. Let y{t) G R''''^, t G [0, T] be the solution of 
the projected system 

N 

= Mf^iV'y) + ^ /„(I?'2/)% , i = l,...,N, 

y,(0) = Afa:^ i = l,...,iV, (3.5) 
such that for a suitable /3 > 0, 

max ||j/(i)|L„,„d-, < /3. (3.6) 

tG[0,T] ""=ooi«2J 

Let us define the column-wise £2-error ei{t) := \\yi — AfxiH^i- for i = 1, . . . , N and 
£{t) := jnax^ei(t) = \\y ~ A/x-||^„^^fc^ . 
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Then we have the estimate 

8{t) < eat{L \\M\\ + L" 13) exp [(2L ||A/|| + 2/?L" + L')t\ 



(3.7) 



Proof. Due to (|1.2p - (|1.4p . we have for every i = 1, . . . , TV the estimate 



dt 



{y,- Mx,,f^{y, - Mxj)) 
lly, - Mx^\\i,u 



< 



— iy,-Mx,) 



N 



< 



WMMV'y) - Mf,iVx)\\,. + \\hi'D'y)yj - hj{Vx)Mx,\\,. 



N 



< 



L \\M\\ \\V'y - + J2 {\\nj{'Dx){Mx, - y,)\\,. + m.iVx) - f,,{V'y))y,\\,. 



< L \\M\\ WV'y - 'Dx\\^N(^iN) + L' \\Mx - J/||fN(ffc) + L" j|2?x - WvWi'^ii'^) 

The term \\V'y -VxWgN (^^n) < WV'y - V Mx\\^m i^^n ) + WV Mx ~ VxW^m (^^n ) is esti- 
mated by 



\V'y-VMx\\ 



max 



< max \\yi — Mxi\\(k + \\yj — Mxj\\ik < 2£{t) , 



and, using the assmnption p.4p . 



\V'Mx -Vx\\ 



max 



\\Mx^- MxjWgk - \\xi-Xj\\t 



< £ max x i — X 



e||2?a;|| 



Finally, by the a priori estimate p.3p for HPxH^jv (^n 
obtain 



and 



for we 



dt 



ei<L \\AI\\ {2£{t) + ea) + L'£(t) + L" (3{2£{t) + ea) 
= {2L \\M\\ + 2PL" + L')£{t) + ea{L \\M\\ + L" p) . 



Now, let us split the interval [0,r) into a union of finite disjoint intervals Ij = 
[tj-i,tj), j = 1,...,K for a suitable K £ N, such that £{t) = ei(^j){t) for t E Ij. 
Consequently, on every we have 

(t) = ^e,(,)(i) < {2L \\M\\ + 2/3i" + L')£{t) + ea{L + , 
and the Gronwall lemma yields 

£{t) < [ea{L \\M\\ + L" P){t - tj^i) + £{tj^i)\ exp {{2L \\M\\ + 2/3L" + L'){t - tj_i)) 

for t € [tj-i,tj). A concatenation of these estimates over the intervals Ij leads finally 
to the expected error estimate 

£{t) < eat{L \\M\\ + L" f3) exp [{2L \\M\\ + 2/3L" + L')t] . 



□ 
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3.2. A continuous Johnson-Lindenstrauss Lemma. Let us now go through 
the assumptions we made in the formulation of Theorem 13 . 1 1 and discuss how they re- 
strict the validity and applicability of the result. First of all, let us mention that p.3p 
and (j3.6[) can be easily proven to hold for locally Lipschitz right-hand sides fi and 
fij on finite time intervals. Obviously, the critical point for the applicability of The- 
orem [33] is the question how to find a matrix M satisfying the condition p.4[) . i.e., 
being a quasi-isometry along the trajectory solution x{t) for every t g [0,r]. The an- 
swer is provided by the following generalization of the Johnson-Lindenstrauss Lemma 
(Lemma 12. ip for rectifiable C^-curves, by a suitable continuity argument. Let us 
stress that our approach resembles the "sampling and e-net" argument in [3l IH [59] for 
the extension of the quasi-isometry property of Johnson-Lindenstrauss embeddings to 
smooth Riemmanian manifolds. From this point of view the following result can be 
viewed as a specification of the work [U [SH] . 
We first prove an auxiliary technical result: 

Lemma 3.2. Let Q < e < e' < 1, a e and let M : R'' M'^' be a linear 
mapping such that 

(l-£)||a||,. < ||A/a||,. < (l + £)||a||,.. 

Let X eM.'^ satisfy 

is' — e)||a|Ld 

\\a~x\\ < , „ (3.8) 
" " - \\M\\ + l+e' ^ ' 

Then 

{l-e')\\x\\,. < \\Mx\\,. < (l + £')ll^ll.- (3.9) 



Proof. If a = 0, the statement is trivial. If a 7^ 0, we denote the right-hand side 
of (|3.8|) by r > and estimate by the triangle inequality 

\\Mx\\,. _ \\M{x-a) + Ma\\,. ^ \\M\\ ■ ||x - a||,. + (1 + £)||aH,. 



||a:||^d \\x-a + a\\id - - 

|jM|j •r + (l+e)||a||^d 
< - — „ „ < 1 + e' . 

A similar chain of inequalities holds for the estimate from below. □ 
Now we are ready to establish a continuous version of Lemma 12.11 
Theorem 3.3. Let (p : [0, 1] ~^R'^ be a curve. Let < e < e' < 1, 

WiOWii r 7 

7 := max - — ^ < 00 and M > (\'d + 2)- — . 

MOWii ~ ' e'-e 

Let k be such that a randomly chosen (and properly normalized) projector M satisfies 
the statement of the Johnson-Lindenstrauss Lemma \2.1\ with e,d,k and M arbitrary 
points with high probability. Without loss of generality we assume that \\M\\ < \fdjk 
within the same probability (this is in fact the case, e.g., for the examples of Gaussian 
and Bernoulli random matrices reported in Section\^. 
Then 

(1 - e')Mt)\\,. < WM^m,. <{1 + e')Mt)\\,., for all t e [0, 1] (3.10) 
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holds with the same probability. 

Proof. Let ti = i/M, i — Q, . . . ,Af and put 

:= arg max^gfj^ t^^^]||(p'(^)||^d, i = 0, . . . , A/" - 1. 

Let M : R'' M'^' be the randomly chosen and normahzed projector (see Lemma [2TT|) . 
Hence ||M|| < i/d/^ and 

(1 - e')Mm,. < ||M(^(T,))||,. < (1 + e')MT,)\\,., i = l,...,N (3.11) 

with high probabihty. We show that p.lOp holds with (at least) the same probability. 

This follows easily from p. lip and the following estimate, which holds for every 
t G [ti, ti+i], 

ll^(i) - (p(r,)|l^d < 11^ {s)\\iids < — < — 2) — 

^ MT,)\\,.{e' ~ e) ^ MT,)\\,.{e' ~ e) 
Vd + 2 - p/|| + l + £' 

The proof is then finished by a straightforward application of Lemma 13.21 □ 
Remark 1. We show now that the condition 

7 := max „ ,^.,| - < oo 

is necessary, hence it is a restriction to the type of curves one can quasi-isometrically 
project. Let d > 3. It is known that there is a continuous curve ip : [0, 1] — >■ [0, 1]''""'^, 
such that (p{[0,l]) ~ [0,1]''"^, i.e., (p goes onto [0,1]''"^. The construction of such a 
space-filling curve goes back to Peano and Hilbert. After a composition with suitable 
dilations and d-dimensional spherical coordinates we observe that there is also a sur- 
jective continuous curve (p : [0,1] — > E>'^~^, where denotes the £2 unit sphere in 

As M was supposed to be a projection, p.lOp cannot hold for all t's with ip{t) € 
ker M 

Obviously, the key condition for applicability of Theorem 13.31 for finding a pro- 
jection matrix M satisfying p.4p is that 



1 1 X'i X jy\^ £d 

sup max-r < 7 < 00 . (3.12) 

te[o,T] *J \\Xi~Xj\\(d 

This condition is, for instance, trivially satisfied when the right-hand sides /i's and 
/ij's have the following Lipschitz continuity: 

Wf^iVx) ^ f,{Vx)\\,. <L"'\\x,- Xj\\i. foraU^,J = l,...,7V, 
\f^A'^^) ~ hA'Dx)] < L""\\x, - a-, II,. for alH, J, fc - 1, . . . , Af. 

We will show in the examples below how condition p.l2p is verified in cases of dynami- 
cal systems modeling standard social mechanisms of attraction, repulsion, aggregation 
and alignment. 
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3.3. Applicability to relevant examples of dynamical systems describ- 
ing social dynamics. In this section we show the applicability of our dimensionality 
reduction theory to well-known dynamical systems driven by "social forces" of align- 
ment, attraction, repulsion, aggregation, and self-drive. Although these models were 
proposed as descriptions of group motion in physical space, the fundamental social 
effects can be considered as building blocks in the more abstract context of many- 
parameter social dynamics. It has been shown [Mj [50] that these models are able to 
produce meaningful patterns, for instance mills in two spatial dimensions (see Fig- 
ure [XT]) i reproducing the behavior of certain biological species. However, we should 




Fig. 3.1. Mills in nature and in models 



expect that in higher dimension the possible patterns produced by the combination 
of fundamental effects can be much more complex. 

3.3.1. The Cucker-Smale system (alignment efTect). As shown in Sec- 
tion[2J the Cucker and Smale flocking model (|2.12p - (|2.13p is of the type (jl.ip . satisfies 
the Lipschitz continuity assumptions (|1.2p - (|1.4p . and it is bounded at finite time, as 
already discussed in Section 12.31 Therefore, to meet all the assumptions of Theo- 
rem [O] we only need to check that it also satisfies the condition p.l2p . However, 
for this we need to consider a slightly different framework than in Section [23] instead 
of considering the 2A'' d-dimensional variables {N position variables and N velocity 
variables), we need to arrange the model as N variables in R^'^, each variable consist- 
ing of the position part (first d entries) and of the velocity part (the other d entries) . 
We have then 



W^i - ^oWii + \Wi - < \\vi - vj 



< \\v, - V, 



< \\v, - f, 



< \\vi - Vj 

for a suitable constant c depending 
boundedncss of the term I 



1 ^ 



k=l 



N 



\\9\\up 
N 



' N \ 
^llwfcllfrf j \\Xi-Xj\\t,d 



\gd + c\xi — ll^d , 

on the initial data. We used here the a-priori 
Wfcll^d ), see [23] or [38] for details. Consequently, 



16 



M. FORNASIER, J. HASKOVEC AND J. VYBIRAL 



we can satisfy (j3.12p with 7 — max(l,c). 

3.3.2. Second order dynamic model with self-propulsion and pairwise 
interactions (self-drive, attraction, and repulsion effects). Another practi- 
cally relevant model which fits into the class given by (|l.ip is a second order dynamic 
model with self-propulsion and pairwise interactions, |45[ I50| : 

= Vi , (3.13) 

v, = {a~b\\v,\\^^,)v,~j^J2\/,,U{\\x,-Xj\\^,), i^l,...,N, (3.14) 

where a and b are positive constants and U : [0, 00) — > M is a smooth potential. We 
denote u{s) = U'{s)/s and assume that u is a bounded, Lipschitz continuous function. 
We again arrange the model as a system of N variables in M^'^, each variable consisting 
of the position part (first d entries) and of the velocity part (the other d entries). 
Consequently, the model can be put into a form compliant with (|l.ip as follows: 

N 

N N 



with f[f = S,j, f^^iVx) = -jf Y.j^i u{\\x^ - Xj\\g^) and f^j'iVx) = jfU{\\xi - Xj\\i^) 
for i ^ j. Moreover, we may set /™(I'w) = Sij{a—b\\vi\\'ja) by introducing an auxiliary, 

noninfluential constant zero particle {xo,vo) ~ (0,0) with null dynamics, i.e., /q* — 
and /q* = 0, where *,* g {x,v}. Then, (|1.2p is void, while (|1.3p is satisfied by 

m^xY,{\f^;{Vx,Vv)\ + \f^;{Vx,Vv)\ + \f^f {Vx,Vv)\) 
j 

< l + a + bmax\\vi\\^^i +2||u||^^ < L' , 

since the theory provides an apriori bound on f3y := supjgjQ -^j max^ Ht'illfd, see |50| . 
Condition (|1.4[) for /[^^ is void, while for /™ it is satisfied by 

max^ |/™(I?«) - f^f{Vw)\ < 6max \\v,\f,, - \\w,f,. 



3 



< 6max(^|ii;i||^d + \\wi\\idj \\v, - w,\\gi 

< L" WVv - 'Dw\\^N(^iN^ , 

where we again use the apriori boundedness of /?„. For /^^^ is (|1.4p satisfied by 
^ 2 ^ I 



< 2|lM!lLipll^'2;-2?y||^«(^„; 
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Finally, it can be easily checked that condition p.l2p is satisfied by 

-ijllfd + -Vj\\g^ < {I + a + 3bl3l)\\v, - Ujll^d + (\\u\\^^ + 2/3^ IMup) W^^ ~ ^oWii > 

where fix sup^^ jq -^j max^ ||xi||^d. We notice that also this model is bounded at 
finite time as shown in jl31 Theorem 3.10 (formula (22))], and therefore for any 
fixed horizon time T, there is a constant a = a{T) > such that (|2.9p and p.3p 
hold. In the paper |50j it is shown that this model tends to produce patterns of 
different quality, in particular mills, double mills, and translating crystalline flocks 
(see also Figure [XT]) . These patterns were further studied in [TS]. Starting from the 
Liouville equation for the many-body problem the authors derive the corresponding 
kinetic and macroscopic hydrodynamic equations. The kinetic theory approach leads 
to the identification of macroscopic structures otherwise not recognized as solutions 
of the hydrodynamic equations, such as double mills of two superimposed flows. The 
authors found conditions allowing for the existence of such solutions and compared 
them to the case of single mills. In [TJ the authors utilize the methods of classical 
statistical mechanics to connect the individual-based models of the type (|3.13l) - (|3.14p 
to their continuum formulations and determine criteria for the validity of the latter. 
They show that H-stability of the interaction potential plays a fundamental role in 
determining both the validity of the continuum approximation and the nature of 
the aggregation state transitions. They perform a linear stability analysis of the 
continuum model and compare the results to the simulations of the individual-based 
one. 

Without entering into further details, let us stress that mills and double mills are 
uniformly bounded in time (and stable). Hence in these cases, we can assume that 
actually the constant a is again bounded for all times. Moreover, when the dynamics 
converges to a translating crystalline flocks, we may reason in a similar way as done 
for the Cuckcr-Smale model (although in this case the pattern in unstable). 

3.4. Recovery of the dynamics in high dimension from multiple simula- 
tions in low dimension. The main message of Theorem 13. II is that, under suitable 
assumptions on the governing functions fi, fij , the trajectory of the solution y{t) of the 
projected dynamical system (|3.5p is at an e error from the trajectory of the projection 
of the solution x{t) of the dynamical system (|3.ip - (|3.2[) . i.e., 

y,(t) w Mx,(t) or, more precisely, \\Mxi{t) - y^(t)||^fc < C(t)e, t e [0,T]. (3.15) 

We wonder whether this approximation property can allow us to "learn" proper- 
ties of the original trajectory x{t) in high dimension. 

3.4.1. Optimal information recovery of high-dimensional trajectory from 
low-dimensional projections. In this section we would like to address the following 
two fundamental questions: 

(i) Can we quantify the best possible information of the high-dimensional tra- 
jectory one can recover from one or more projections in lower dimension? 

(ii) Is there any practical method which performs an optimal recovery? 

The flrst question was implicitly addressed already in the 70 's by Kashin and later 
by Garnaev and Gluskin [411 132j , as one can put in relationship the optimal recovery 
from linear measurements with Gelfand width of £p-balls. see for instance [H]. It was 
only with the development of the theory of compressed sensing |121 127j that an answer 
to the second question was provided, showing that ^i-minimization actually performs 
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an optimal recovery of vectors in high dimension from random hnear projections to 
low dimension. We address the reader to [311 Section 3.9] for further details. In 
the following wc concisely recall the theory of compressed sensing and we apply it to 
estimate the optimal information error in recovering the trajectories in high dimension 
from lower dimensional simulations. 

Again a central role here is played by (random) matrices with the so-called Re- 
stricted Isometry Property RIP, cf. |llj . 

Definition 3.4 (Restricted Isometry Property). A k x d matrix M is said to 
have the Restricted Isometry Property of order K < d and level S e (0, 1) if 

(1 - 5)||a;||| < ||Ma;||| < (1 + 5)||x||| 

for all K -sparse x e 12k ^ {z £ R'^ : #supp (z) < K}. 

Both the typical matrices used in Johnson-Lindenstrauss embeddings (cf. Lemma 
12. ip and matrices with RIP used in compressed sensing are usually generated at 
random. It was observed by [3] and j44| , that there is an intimate connection between 
these two notions. A simple reformulation of the arguments of [3] yields the following. 

Theorem 3.5 (Baraniuk, Davenport, DeVore, and Wakin). Let M be a k x d 
matrix drawn at random which satisfies 

(1 - S/2)\\x\\% < llA/xlll < (1 + 5/2)|lx|||, xer 

for every set V C with j^V < (^)^ with probability < ly < 1. Then M 
satisfies the Restricted Isometry Property of order K and level S/3 with probability at 
least equal to v. 

Combined with several rather elementary constructions of Johnson-Lindenstrauss 
embedding matrices available in literature, cf . [1] and |25j , this result provides a simple 
construction of RIP matrices. The converse direction, namely the way from RIP 
matrices to matrices suitable for Johnson-Lindenstrauss embedding was discovered 
only recently in [44] . 

Theorem 3.6 (Krahmer and Ward). Fix 77 > and e > 0, and consider a finite 
set V C R"^ of cardinality \V\ = TV. Set K > 40 log and suppose that the k x d 

matrix M satisfies the Restricted Isometry Property of order K and level 5 < e/4. 
Let ^ £ M."^ be a Rademacher sequence, i.e., uniformly distributed on {—1, l}'^ . Then 
with probability exceeding 1 — 77, 

(l-£)i|.i:||2, < ||Mx||| < (l + e)||a;||2,. 

uniformly for all x £ V , where M M diag(^), where diag(^) is a d x d diagonal 
matrix with ^ on the diagonal. 

We refer to [5T] for additional details. 

Remark 2. Notice that M as constructed in Theorem \ 3.6] is both a Johnson- 
Lindenstrauss embedding and a matrix with RIP, because 

(1 - S)\\x\\l, - (1 - (5)11 diag(Ox||| < II Mdiag(e)x||| 

■.=M 

< (1 + J)||diag(e)x||| = (l + <5)||.x|||. 

The matrices considered in Section [H satisfy with high probability the RIP with 

k 



K = 



1 + \og{dlk) 
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Equipped with the notion of RIP matrices we may state the main resuh of the 
theory of compressed sensing, as appearing in f28| . which we shah use for the recovery 
of the dynamical system in Mf^. 

Theorem 3.7. Assume that the matrix M e E''^'' has the RIP of order 2K and 
level 



62K < ^ « 0.4627. 




Then the following holds for all x £ M''. Let the low- dimensional approximation 
y = Mx + rj be given with \\ri\\gk < Ce. Let x"^ he the solution of 



mm 



llzjl^d subject to \\Mz — uW^k <\\rj\\^k. (3.16) 



aK{x)i 

fd < Cl£ + C2 ' 



Then 



K 

for some constants Ci,C2 > that depend only on52K, and aK{x)id — inf^.^gupp (2)<^ \\z- 
X^£d IS the best-K -term approximation error in £f. 

This result says that provided the stability relationship p.lSp . we can approximate 
the individual trajectories Xi{t), for each t S [0,T] fixed, by a vector xf{t) solution 
of an optimization problem of the type (j3.16p , and the accuracy of the approximation 
depends on the best-A'-term approximation error aK{xi{t))gd. Actually, the results 
in [121 HZ] in connection with [THl STJ , state also that this is asymptotically the 
best one can hope for. One possibility to improve the recovery error is to increase 
the dimension k (leading to a smaller distortion parameter e > in the Johnson- 
Lindenstrauss embedding). But we would like to explore another possibility, namely 
projecting and simulating in parallel and independently the dynamical system L-times 
in the lower dimension k 

N 

yl = M'f,{V'/)+Y,hA'D'/)yl yf(0) = A/^x°, €=1,...,L. (3.17) 

Let us give a brief overview of the corresponding error estimates. The number of 
points needed in each of the cases \s N ^ N x n^, where N is the number of agents 
and no = T /h is the number of iterations. 

• We perform 1 projection and simulation in M'^': Then e = C>{^^J'^^-^ , K = 
O ( i+\ogi^d/k) ) ^-"^"^ application of Theorem 13 . 71 leads to 



ii.,w-f«)iu. <c'(«) ^ (3,18) 



Here, C'{t) combines both the constants from Theorem 13.71 and the time- 
dependent C{t) from (|3.15p . So, to reach the precision of order C'{t)e > 0, we 

have to choose k gN large enough, such that y < e and — - < e. 

We then need k x iV^ operations to evaluate the adjacency matrix. 
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Wc perform 1 projection and simulation in 



pLxk. 



Then e' = O 



logA^ 
Lk 



and 



K' = O 



Lk 

l+log(d/Lfc) 



and an application of Theorem 13.71 leads to 



\\x.it)~xfit)\U<C'it) 



/log TV (TK'{Xt(t))f,^ 



Lk 



/K' 



(3.19) 



The given precision of order C'{t)e > 0, may be then reached by choosing 

k,L eN large enough, such that yf^^ < e and ^ < e. We then 

need Lk x A^^ operations to evaluate the adjacency matrix. 

We perform L independent and parallel projections and simulations in M'^': 



Then we assemble the following system corresponding to p.l7p 



Mx 



( \ 

M2 



( v\ \ 



\ yt J 



where for all 



1, . . . ,L the matrices £ 



pkxd 



are (let us say) ran- 



dom matrices with each entry generated independently with respect to the 
properly normalized Gaussian distribution as described in Section [21 Then 
m/Vl is a Lk X d matrix with Restricted Isometry Property of order K' = 

O ( i+io^(d/Lfc) ) S < 0.4627. The initial distortion of each of the 



projections is still e = O 
can compute xf{t) such that 



log TV 
k 



Therefore, by applying Theorem 13.71 we 



\Mt)-xfm,.<c'it) 



/log TV aK'{x^{t))^,a 



(3.20) 



Notice that the computation of xf{t) can also be performed in parallel, see, 
e.g., |29j . The larger is the number L of projections we perform, the larger 
is K' and the smaller is the second summand in (j3.20p : actually a^' {xi{t))gd 
vanishes for K' > d. Unfortunately, the parallelization can not help to reduce 
the initial distortion e > 0. To reach again the precision of order C'{t)e > 0, 

we have to choose fc G N large enough, such that 
chose L > 1 large enough such that 



logAf 
k 



/K' 



< e. Then we 
< e. We again need k x N-^ 



operations to evaluate the adjacency matrix. 
In all three cases, we obtain the estimate 



\x^it)~xtit)hd<C'it) 



crK{Xiit))fd 



(3.21) 



where the corresponding values of e > and K together with the number of operations 
needed to evaluate the adjacency matrix may be found in the following table. 
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number of operations 



1 projection into 



O 



logA^ 
fc 



o 



l+log(d/fe) 



1 projection into 



nLxfc 



o 



logAA 
Lk 



' i+iog(d/Lfc; 



Lfc X 



L projections into 



O 



log AT 
k 



o 



Lk 

l+\os(d/Lk] 



kx 



3.4.2. Optimal recovery of trajectories on smooth manifolds. In recent 
papers [H I59( I39j . the concepts of compressed sensing and optimal recovery were 
extended to vectors on smooth manifolds. These methods could become very useful in 
our context if (for any reason) we would have an apriori knowledge that the trajectories 
Xi {t) keep staying on or near such a smooth manifold. Actually this is the case, for 
instance in molecular dynamics, where simulations, e.g. in the form of the coordinates 
of the atoms in a molecule as a function of time, lie on or near an intrinsically-low- 
dimensional set in the high-dimensional state space of the molecule, and geometric 
properties of such sets provide important information about the dynamics, or about 
how to build low-dimensional representations of such dynamics |52[ I58j . In this case, 
by using appropriate recovery methods as described in [39], we could recover high- 
dimensional vectors from very low dimensional projections with much higher accuracy. 
However, this issue will be addressed in a following paper. 

3.5. Numerical experiments. In this section we illustrate the practical use 
and performances of our projection method for the Cucker-Smale system (|2.12p - (|2.13p . 

3.5.1. Pattern formation detection in high dimension from lower di- 
mensional projections. As already mentioned, this system models the emergence 
of consensus in a group of interacting agents, trying to align with their neighbors. The 
qualitative behavior of its solutions is formulated by this well known result [^[^155] : 

Theorem 3.8. Let {xi(t),Vi(t)) be the solutions of (|2.12p - (|2.13p . Let us define 
the fluctuation of positions around the center of mass Xc{t) ~ j^'^iLi^iit) , and, 
resp., the fluctuation of the rate of change around its average Vc{t) = jj X^i^i '^^ 



Mt) 



1 " 



Xc{t)\\% , 



1 ^ 



i=i 



Then if either 13 < 1/2 or the initial fluctuations A(0) and T(0) are small enough 
(see for details), then T{t) —> as t —> oo. 

The phenomenon of V{t) tending to zero as t — )■ oo is called flocking or emergence 
of consensus. If /? > 1/2 and the initial fluctuations are not small, it is not known 
whether a given initial configuration will actually lead to flocking or not, and the only 
way to find out the possible formation of consensus patterns is to perform numerical 
simulations. However, these can be especially costly if the number of agents N and 
the dimension d are large; the algorithmic complexity of the calculation is 0{dx N'^). 
Therefore, a significant reduction of the dimension d, which can be achieved by our 
projection method, would lead to a corresponding reduction of the computational 
cost. 

We illustrate this fact by a numerical experiment, where we choose = 1000 
and d = 200, i.e., every agent i is determined by a 200-dimensional vector Xi of its 
state and a 200-dimensional vector Vi giving the rate of change of its state. The 
initial datum (x^,v^) is generated randomly, every component of x^ being drawn 
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k=100 k=10 




200 150 100 50 25 10 5 2 200 150 100 50 25 10 5 

dimension dimension 



Fig. 3.2. Numerical results for /3 = 1.5; First row shows the evolution of r(t) of the system 
projected to dimension k = 100 (left) and fc = 10 (right) in the twenty realizations, compared to the 
original system (bold dashed line). Second row shows the initial values r(t = 0) and final values 
r(t = 30) in all the performed simulations. 



independently from the uniform distribution on [0, 1] and every component of v*^ being 
drawn independently from the uniform distribution on [—1,1]. We choose f3 — 1.5, 
1.62 and 1.7, and with every of these values we perform the following set of simulations: 

1. Simulation of the original system in 200 dimensions. 

2. Simulations in lower dimensions k: the initial condition (a;°,w'^) is projected 
into the /c-dimensional space by a random Johnson-Lindenstrauss projection 
matrix M with Gaussian entries. The dimension k takes the values 150, 100, 
50, 25, 10, 5, and 2. For every k, we perform the simulation twenty times, 
each time with a new random projection matrix M . 

All the simulations were implemented in MATLAB, using 1500 steps of the forward 
Euler method with time step size 0.02. The paths of r(t) from the twenty experiments 
with k = 100 and fc = 25 or fc = 10 are shown in the first rows of Figs. I3.2[ 13.31 and. 
rcsp.. l5^ for /3 = 1.5, 1.62 and, resp., 1.7. 

The information we are actually interested in is whether flocking takes place, in 
other words, whether the fluctuations of velocities T{t) tend to zero. Typically, after an 
initial phase, the graph of T(t) gives a clear indication either about exponentially fast 
convergence to zero (due to rounding errors, "zero" actually means values of the order 
2^0-30 jj^^ ^Yic simulations) or about convergence to a positive value. However, in certain 
cases the decay may be very slow and a very long simulation of the system would be 
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200 150 100 50 25 10 5 2 200 150 100 50 25 10 5 2 

dimension dimension 



Fig. 3.3. Numerical results for /3 = 1.62; First row shows the evolution ofr{t) of the system 
projected to dimension k = 100 (left) and k = 25 (right) in the twenty realizations, compared to the 
original system (bold dashed line). Second row shows the initial values r(t = 0) and final values 
r(t = 30) in all the performed simulations. 

needed to see if the limiting value is actually zero or not. Therefore, we propose the 
following heuristic rules to decide about flocking from numerical simulations: 

• If the value of F at the final time t = 30 is smaller than 10"^", wc conclude 
that flocking took place. 

• If the value of r(30) is larger than 10"'^, we conclude that flocking did not 
take place. 

• Otherwise, we do not make any conclusion. 

In the second rows of Figs. l3.2[|3.3l and l3.4l we present the initial and final values of F of 
the twenty simulations for all the dimensions k, together with the original dimension 
d ~ 200. In accordance with the above rules, flocking takes place if the final value 
of F lies below the lower dashed line, does not take place if it lies above the upper 
dashed line, otherwise the situation is not conclusive. The results are summarized in 
Table O 

Experience gained with a large amount of numerical experiments shows the fol- 
lowing interesting fact: The flocking behavior of the Cucker-Smale system is very 
stable with respect to the Johnson-Lindenstrauss projections. Usually, the projected 
systems show the same flocking behavior as the original one, even if the dimension is 
reduced dramatically, for instance from d = 200 to fc = 10 (see Figs l3.2l and l3.4p . This 
stability can be roughly explained as follows: Since the flocking behavior depends 
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200 150 100 50 25 10 

dimension 



i f 



I I 



200 150 100 50 25 10 5 2 

dimension 



Fig. 3.4. Numerical results for /3 = 1.7; First row shows the evolution of r(t) of the system 
projected to dimension k = 100 (left) and fc = 10 (right) in the twenty realizations, compared to the 
original system (bold dashed line). Second row shows the initial values r(t = 0) and final values 
r(t = 30) in all the performed simulations. 



mainly on the initial values of F and A, which are statistical properties of the random 
distributions used for the generation of initial data, and since N is sufRciently large, 
the concentration of measure phenomenon takes place. Its effect is that the initial 
values of the fluctuations of the projected data are very close to the original ones, and 
thus the flocking behavior is (typically) the same. There is only a narrow interval of 
values of /? (in our case this interval is located around the value /? — 1.62), which is 
a borderline region between flocking and non-flocking, and the projections to lower 
dimensions spoil the flocking behavior, see Fig l3.3l Let us note that in our simulations 
we were only able to detect cases when flocking took place in the original system, but 
did not take place in some of the projected ones. Interestingly, we never observed the 
inverse situation, a fact which we are not able to explain satisfactorily. In fact, one 
can make other interesting observations, deserving further investigation. For instance. 
Figs. 13.2 1 and 13. 31 show that if the original system exhibits flocking, then the curves of 
r(t) of the projected systems tend to lie above the curve of T{t) of the original one. 
The situation is reversed if the original system does not flock, see Fig. 13.41 

From a practical point of view, we can make the following conclusion: To obtain an 
indication about the flocking behavior of a highly dimensional Cucker-Smale system, 
it is typically satisfactory to perform a limited number of simulations of the system 
projected into a much lower dimension, and evaluate the statistics of their flocking 
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Statistics of the flocking behaviors of the systems in the original dimension d = 200 and in 
the projected dimensions. With /3 = 1.5 and fi = 1.62, the original system (d = 200) exhibited 
flocking behavior. With /3 = 1.5, even after random projections into 25 dimensions, the system 
exhibited flocking in all 20 repetitions of the experiment, and still in 14 cases in dimension 10. With 
/3 = 1.62, the deterioration of the flocking behavior with decreasing dimension was much faster, 
and already in dimension 25 the situation was not conclusive. This is related to the fact that the 
value P = 1.62 was chosen to intentionally bring the system close to the borderline between flocking 
and non-flocking. Finally, with jS = 1.7, the original system did not flock, and, remarkably, all the 
projected systems (even to two dimensions) exhibit the same behavior. 



behavior. If the resuh is the same for the majority of simulations, one can conclude 
that the original system very likely has the same flocking behavior as well. 



Relative error of projection of the v-variabies 



Reiative error of reconstruction of the v-variabies 
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Fig. 3.5. Numerical results showing the time evolution of the relative error of projection (left 
panel) and relative error of recovery via £i-minimization (right panel) of the v-variables. 



3.5.2. Numerical validation of the high and low dimensional approxi- 
mation properties. Finally, we show how the relative error of projection and re- 
covery evolves in time. We consider an initial datum € R^^*^ x M^^'* for 
the Cuckcr-Smalc system with N = d = 200 and randomly generated entries from 
the normal distribution. The parameter (3 ~ 0.4, therefore, the system will exhibit 
flocking. First we project the system into k = 20, 40, 60, 100, 140, 180 dimensions and 
calculate the relative error of the projection of the w-variables, given by 

j:Ijmu,\\% ) 
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Wc observe that the maximal relative error (for k = 20) is around 14%, which we 
consider as a very good result. Moreover, in all 9 cases, the error first increases, but 
after t ~ 22 it starts decreasing, which is a consequence of the flocking behavior and 
concentration of measure, see the graphics in Figurc l375l on the left. This clearly shows 
that the worst-case estimate of Theorem 12 . 2 1 with exponential growth in time is overly 
pessimistic. 

In our second experiment, we take a randomly generated initial condition with 
N = d = 200 and 80% of the entries set to zero. Then, we take L projections of the 
system into 20 dimensions, with L = 1, 2, 3, 5, 7, 9, and reconstruct the u-trajcctorics 
using £i-minimization, as described in Section [3.4.1l In the graphics in Figure |375] on 
the right, we plot the relative errors, given by 



where Vi are the recovered trajectories. Again, we observe that the errors grow much 
slower than exponentially, and after < ~ 15 they even tend to stay constant or slightly 
decrease. 

4. Mean-field limit and kinetic equations in high dimension. In the pre- 
vious sections we were concerned with tractable simulation of the dynamical systems 
of the type (|1.1[) when the dimension d of the parameter space is large. Another source 
of possible intractability in numerical simulations appears in the situation where the 
number of agents N is very large. In general, large N imposes even a much more se- 
vere limitation than large d, since the computational complexity of (jl.ip is 0{d x N'^). 
Therefore, in the next sections we consider the so-called mean-field limit of (jl.ip as 
— > oo, where the evolution of the system is described by time-dependent probability 
measures iJ.{t) on R'', representing the density distribution of agents, and satisfying 
mesoscopic partial differential equations of the type ()4.1|) . This strategy originated 
from the kinetic theory of gases, see [16] for classical references. We show how our 
projection method can be applied for dimensionality reduction of the corresponding 
kinetic equations and explain how the probability measures can be approximated by 
atomic measures. Using the concepts of delayed curse of dimension and measure 
quantization known from optimal integration problems in high dimension, we show 
that under the assumption that the measure concentrates along low-dimensional sub- 
spaces (and more generally along low-dimensional sets or manifolds), it can be ap- 
proximated by atomic measures with sub-exponential (with respect to d) number of 
atoms. Through such approximation, we shall show that we can approximate suitable 
random averages of the solution of the original partial differential equation in high 
dimension by tractable simulations of corresponding solutions of lower-dimensional 
kinetic equations. 

Another interesting approach to the problem of efficient numerical simulation of 
large group dynamics is the so-called "equation- free" approach, see e.g. [47] . Here, 
convenient coarse-grained variables that account for rapidly developing correlations 
during initial transients are chosen, in order to perform efficient computations of 
coarse-grained steady states and their bifurcation analysis. The big advantage of the 
equation-free approach is that the coarse-grained dynamics can be explored without 
the assumption of the continuum limit equation as we consider here. The premise of 
the method is that coarse-grained governing equations conceptually exist, but arc not 
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explicitly available in closed form. The main idea is that short bursts of appropri- 
ately initialized microscopic (fine-scale) simulations and the projection of the results 
onto coarse-grained variables result in time-steppers (mappings) for those variables 
(which is effectively the same as the discretization of the unavailable equations) . One 
then processes the results of the short simulations to estimate various coarse-grained 
quantities (such as time derivatives, action of Jacobians, residuals) to perform rele- 
vant coarse-grained level numerical computations, as if those quantities were obtained 
from coarse-grained governing equations. 

4.1. Formal derivation of mean-field equations. In this section we briefly 
explain how the mean-field limit description corresponding to can be derived. 

This is given, under suitable assumptions on the family of the governing functions 

~ {fii fij '■ j = i, - ■ ■ N}, by the general formula 

1^ + V • (H^[/i]M) - 0, (4.1) 

where 'Hj^[fi] is a field in R'^, determined by the sequence ~ {J^N)NeN- 

In order to provide an explicit example, we show how to formally derive the mean 
field limit of systems of the type 

Xi=V^, (4.2) 
N N 

^» - E f^fi1^^^^^v)v, + J2 flTiT^^)^! ' (4-3) 

with 

fl^Vx) = JjY^uiWx, ~ x,\\,.) + ^—^u{\\x, x,\\,.) , 



( 1 

n;\Vx,Vv)^8Ah{\v,,\\\^--Y^g{ 



Note that for suitable choices of the functions h,g,u this formalism includes both the 
Cucker-Smale model (|2.12[) - (|2.13p and the self-propulsion and pairwise interaction 
model (|3.13p - (|3.14p . We define the empirical measure associated to the solutions 
Xi{t), v,{t) of g21)~g^ as 

1 ^ 

Taking a smooth, compactly supported test function ^ S C^{S?''^) and using (|4.2p - 
(|4.3p . one easily obtains by a standard formal calculation (see |14j ) 



(4.4) 

V.A{x,v)-vd^i^{t,x,v)+ / W„^{x,v)■H[^i^{t)]{x,v)A^l^{t,x,v), 



R2d 

with 

'H[y]{x,v) = h{\\v\\ia)v + I g(\\x-y\\ia){w-v)d^i{y,w)+ / u{\\x - y\\iA{y - x) d^l{y,w) . 
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Wc now assume weak convergence of a subsequence of (/i^(t))ArgN to a time-dependent 
measure /i(t) — fi(t, v) and boundedness of its first order moment, which indeed can 
be estabHshed rigorously for the Cucker-Smale and the self-propulsion and pairwise 
interaction systems (see [38], [50]). Then, passing to the limit — >■ c» in (|4.4p . one 
obtains in the strong formulation that fi is governed by 

Ofjj 

— {t,x,v) +v ■ 'Vxf^{t,x,v) + Vi, • {n[iJ.{t)]{x,v)ii{t,x,v)) = 0, 

which is an instance of the general prototype (j4.ip . 

Using the same formal arguments as described above, one can easily derive mean 
field limit equations corresponding to (jl.ip with different choices of the family J-. 

4.2. Monge-Kantorovich-Rubinstein distance and stability. In several 
relevant cases, and specifically for the Cuckcr-Smale and the self-propulsion and pair- 
wise interaction systems [T3], solutions of equations of the type (|4.ip are stable with 
respect to suitable distances. We consider the space 'Pi{R'^), consisting of all proba- 
bility measures on M'' with finite first moment. In 'Pi{M.'^) and for solutions of (|4.ip . a 
natural metric to work with is the so-called Monge-Kantorovich-Rubinstein distance 
(also called Wasserstein distance) [57], 



Wi{n,iy) ~ sup{|(/x- 1^,01 



^(x)d(pi - v){x) 



,eeLip(M^),Lip(0<l}. 



(4.5) 

We further denote Vd^'^) the space of compactly supported probability measures on 
W^. In particular, throughout the rest of this paper, we will assume that for any 
compactly supported measure valued weak solutions n{t),v{t) e C{[0,T],'Pc{R'^)) of 
(|4.ip we have the following stability inequality 

Wli^iit),lyit))<c{t)Wl{^lio),l^io)), te[o,T], (4.6) 

where C{t) is a positive increasing function of t with C(0) > 0, independent of the 
dimension d. We address the interested reader to [TS] Section 4] for a sample of general 
conditions on the vector field H{J^]{ij.) which guarantee stability (|4.6p for solutions of 
equations (|4.1[) . 

4.3. Dimensionality reduction of kinetic equations. Provided a high-dimensional 
measure valued solution to the equation 

|^+V.(H^[M]Ai)=0, ^l{0) = ^IoeVciR''), (4.7) 

we will study the question whether its solution can be approximated by suitable 
projections in lower dimension. 

Given a probability measure fi G 7^i(M'*), its projection into M.'' by means of a 
matrix M : — > M'^ is given by the push-forward measure := M^fi, 

{fiM,(p) f{M-)) for all ip e Lip(M*-'). (4.8) 

Let us mention two explicit and relevant examples: 

• If /i^ — jj X)i3=i is an atomic measure, we have (/^j^, f) = {iJ,^ , tf{M-)) = 



N 

TfY^=iV{Mx,). Therefore, 

N 



^M = ^E'^^^-- (4-9) 

1=1 
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If fi is absolutely continuous with respect to the Lebesgue measure, i.e., it is a 
function in L^(&.'^), the calculation requires a bit more effort: Let us consider 
the pseudo-inverse matrix of M. Recall that = M*{MM*)-'^ is a 
right inverse of M, and M^M is the orthogonal projection onto the range of 
M* . Moreover, x = APMx + £,x, where £,x & kerM for aU x G R"*. According 
to these observations, we write 

f{Mx)fi{x)dx^ [ ip{Mx)fi{M^ Mx + £_x)dx 

ip{Mx)fi{MHlx + £,x)dx 

ranA/*0kor M 

)dv^dv 



ranA/* J kcr M 



Note now that M|,.anAf* ■ ranM* — >■ ranM ^ R is an isomorphism, hence y = 
Mv implies the change of variables dv = dei{M\^^^M')^^dy = det{MM*)~^^^dy. 
Consequently, we have 



(p{Mx)n{x)dx = I ip{Mx)fi{M^ Mx + ^x)dx 

/ ip{Mv)n{M''Mv + v-^)dv-^dv 

lA/* JkcrA/ 

(dS(]v7W^L„''<*'''' + ''')*")''''')*' 



and 



^-^(^) = det(MM-)V^ X„./-("^^^ + "")^-" 

According to the notion of push-forward, we can consider the measure valued function 
ly G C([0,T],7'c(R'')), solution of the equation 

+ V • {n^,, Miy) = 0, 1^(0) = Mm e VciR"), (4.10) 

ot 

where {^io)M = A/#/-io and Tm = {{Mfi, fij,i,j = l,...,N})Nm- As for the 
dynamical system (|3.5p , also equation (|4.10p is fully defined on the lower-dimensional 
space R'^ and depends on the original high-dimensional problem exclusively by means 
of the initial condition. 

The natural question at this point is whether the solution i/ of (|4.10p provides 
information about the solution /i of (|4.7p . In particular, similarly to the result of 
Theorem 13. 11 we will examine whether the approximation 

iy{t) ^ ^iM{t), te[0,T], 

in Monge-Kantorovich-Rubinstein distance is preserved in finite time. We depict the 
expected result by the following diagram: 

m Kt) 

IM i M 

Z^(0) ^ ino)M v{t) W ^lM{t) . 
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This question will be addressed by approximation of the problem by atomic measures 
and by an application of Theorem 13.11 for the corresponding dynamical system, as 
concisely described by 

W'i(M.M")<e AT 

n — > M 

^iM — > V ~ Mm 

Let us now recall the framework and general assumptions for this analysis to be 
performed. We assume again that for all € N the family = {fi,fij : i,j = 
1, . . .N} is composed of functions satisfying (|1.2p - (|1.4p . Moreover, we assume that 
associated to = (J^jv)AreN and to 

N 

±,{t) MVx{t)) + Y,.Uji'Dx{t))x,{t), (4.11) 
we can define a mean-field equation 

|^ + V-(H[J-](m)m) = 0, ^,{0) = ^i,eVc{R'), (4.12) 

such that for any compactly supported measure valued weak solutions fi{t),i>{t) € 
C{[0,T],'Pc{R'^)) of dill) we have the following stability 

Wli^lit),lyit))<cit)Wli^lio),l^io)), te[o,T], (4.13) 

where C(t) is a positive increasing function of t, independent of the dimension d. 
We further require that corresponding assumptions, including stability, hold for the 
projected system ()2.5p and kinetic equation (|4.10p . Then we have the following ap- 
proximation result: 

Theorem 4.1. Let us assume that fio G Vd^'^) and there exist points {ccj, . . . , x^} C 
, for which the atomic measure fiQ = X^i^i ^x" approximates fiQ up to e > in 
Monge-Kantorovich-Rubinstein distance, in the following sense 

Wi{fio,f^o) <s, N ^M^'^^^ fork{e)<d andk{e)^dfore^Q. (4.14) 

Requirement (|4.14p is in fact called the delayed curse of dimension as explained below 
in detail in Section \4-5\ Depending on e > we fix also 



k = k{e) = ©(£-2 log(A^)) = ©(£-2 \og{M)k{e)). 

Moreover, let M : — > M.^ be a linear mapping which is a continuous Johnson- 
Lindenstrauss embedding as in (|3.4p for continuous in time trajectories Xi (t) of (|4.1ip 
with initial datum Xi{0) = x^ . Let v € C([0, T], 7'c(M'^)) be the weak solution of 

■{H[Fm]{^)^)^Q, (4.15) 

v{Q) = {po)m e VaiR''), (4.16) 
where {p-o)m = ^-^#Mo- Then 

WiifiMit),i^{t))<Cit)\\M\\e, t€[0,T], (4.17) 
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where C{t) is an increasing function oft, with C(0) > 0, which is at most polynomially 
growing with the dimension d. 

Proof. Let us define {t) the solution to equation (j4.15p with initial datum 
1/^(0) = {ij,q)m, or, equivalently, thanks to (|4.9p 



where yi(t) is the solution of 



1 = 1 



TV 

y, = f,{V'y) + J2 h {'D'y)Vi > ^ = 1, . . . , iV , 
y,(0) = Mx°, z = l,...,7V. 

We estimate 

w^{^iM{t),v{t))<w,{^lM{t),{^i''{t))M) + w^{{^l^{t))M.l^''{t)) + w^^^^ 

By using the definition of push- forward (|4.8p and (|4.14p . the first term can be esti- 
mated by 

w^iiJLMit), {^J^''{t))M) = sup{(MAfW - {^Ji''{t))M.v) ■■ Lip(^) < 1} 

= sup{(^(t) - ^"^(0,^(1/.)) : LiplV') < 1} 
<\\M\\W,{^i{tl^,''{t))<\\M\\C{t)e. 

We estimate now the second term 

M^i((/x^W)m,;^'^(<)) = sup{((M^(t))M - v''{t),v) : Lip(^) < 1} 

1 ^ 

= "'^P^^ W) - '/'(y. W)) : Lip(<^) < 1} 



1=1 



i=l 

We recall the uniform approximation of Theorem 13. H 

\\Mx,{t) - y,{t)\\,. <D{t)e, i = l,...,N, 

where D{t) is the time-dependent function on the right- hand-side of p.7p . Hence 

Wi{fiM{t)At^^it))M)<D{t)e. 

We address now the upper estimate of the third term, by the assumed stability of the 
lower dimensional equation (j4.10p 

Wi{,^^it),iy{t)) < Cit)Wi{iy''iO),iym 

= C{t)Wi{{fl^)M,{flo)M) 

<C{t)\\M\\W{fi^,fio)<CmM\\e. 

We can fix C{t) ~ 2C(t)||Af || + D{t), and, as observed in Theorem 13.31 we can assume 
without loss of generality that < Hence, C{t) depends at most polynomially 
with respect to the dimension d. □ 
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4.4. Approximation of probability measures by atomic measures and 
optimal integration. In view of the fundamental requirement (|4.14p in Theorem 
14.11 given hq G 'Pc(R'^), we arc interested to estabhsh an upper bound to the best pos- 
sible approximation in Monge-Kantorovich-Rubinstein distance by means of atomic 



measures 



N 



N 



12iLo^ (5^.0 with atoms, i.e. 



inf 

N 1 s:^N — 1 X 



(4.18) 



= inf sup{| / a^)d^,oix) - ^ E ^ ^ Lip(K'):Lip(C) < l} 



In fact, once we identify the optimal points {a 



0' ■ 



'-A'-i}' '^^^ them as initial 



conditions Xi{0) 
relationship (|4.6 



for the dynamical system (|4.1ip . and by using the stability 



we obtain 



Wl{^l{t),^I'^t))<ciT)Wlipo,^^o), te[o,T], 



(4.19) 



where fJ^'^ {t) = jf'^iLo^^xi{t)j meaning that the solution of the partial differential 
equation (|4.ip keeps optimally close to the particle solution of (|4.1ip also for suc- 
cessive time t > 0. Note that estimating (|4.18p as a function of is in fact a very 
classical problem in numerical analysis well-known as optimal integration with its 
high-dimensional behaviour being a relevant subject of the field of Information Based 
Complexity [491155]. 

The numerical integration of Lipschitz functions with respect to the Lebesgue 
measure and the study of its high-dimensional behaviour goes back to Bakhvalov [2], 
but much more is known nowadays. We refer to |33| and |36| for the state of the art 
of quantization of probability distributions. 

The scope of this section is to recall some facets of these estimates and to refor- 
mulate them in terms of Wi and £n- We emphasize that here and in what follows, 
we consider generic compactly supported probability measures ^, not necessarily ab- 
solutely continuous with respect to the Lebesgue measure. We start first by assuming 
d = 1, i.e., we work with a univariate measure /i € 'Pc(lR) with support supp fi C [a, b] 
and a := b — a > 0. We define the points xq, . . . , xn-i as the quantiles of the proba- 
bility measure i.e., xq :— a and 



N 



dtJ.{x), 



,N-l. 



(4.20) 



This is notationally complemented by putting xn 
j^^*^ dfi{x) = j^,i = 0, . . . , N — 1, and we have 



b. Note that by definition 



^{x)d^i{x) - j^Yl ^(^') 



< 



Af-l 

E 

i=0 
Af-1 

1=0 



ia^) - ax^))d^Ji{x) 



\^{x)~^{x,)\d^l{x) 



Lip(^) 

N 



N-l 



{xi+1 - Xt) = 



gLip(0 
N ' 



(4.21) 
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Hence it is immediate to see that 

£N{fi)= inf w^{^l,^l^)<^. 

We would like to extend this estimate to higher dimension d > 1. However, for 
multivariate measures there is no such an easy upper bound, see [33] and [35] for 
very general statements, and for the sake of simplicity we restrict here the class of 
measures n to certain special cases. As a typical situation, we address tensor product 
measures and sums of tensor products. 

Lemma 4.2. Let ^i,...,/ e ViiR) with < £j, j ^ ■ ■ ■ ,d for 

some iVi, . . . , iVd e N, ei, . . . , Ed > and /x^-^^ := ^ E^=o'' ■ Let N = Jlti 
Then 

d 

Wl{fl^®■■■®^i'^,^l^)<J2^J' 

where 

1 . . 

:= — ^ (5^ and X := J|{a;^, . . . , a;]^_ J. 

xex j=i 



Proof. The proof is based on a simple argument using a telescopic sum. For 
J = 1 , . . . , d 4- 1 we put 



vr--- 



1 ^ " /■ 



Of course, if j = 1, then the integration over W ^ is missing and if j = d + 1 then 
the summation becomes empty. Now 



„ -, Ni-l a 

/ mdKx) - -r— E ■ ■ ■ E ^(4 = E(^.-+i - 



nti .t^o 



together with the estimate |Vj+i — V, | < finishes the proof. □ 

Lemma l4.2l savs. roughly speaking, that the tensor products of sampling points of 

univariate measures are good sampling points for the tensor product of the univariate 

measures. Next lemma deals with sums of measures. 

Lemma 4.3. Let ^i, . . . G Vi{R'') with Wi{ni,fij^) < ei, I = 1, ... ,L /or 

some A G N, £i, . . . , £l > and := X^ilo^ ^a^i ; • Then 



L 

L 
1=1 

where 

L L 



7^ E '^^ = tE/^'^ a^icf A |J{a;i,o,---,a;i,Ar-i}. 



^ ■ LN^^^ L 

xex 1=1 1=1 
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Proof. We use the homogeneity of the Monge-Kantorovich-Rubinstein distance 
Wi{afi, av) = aWi{(i, v) for n^v ^ 7'i(R'') and a > combined with its subadditivity 
Wi{iJ,i+ IJ2, 1^1 + ^2) < Wi{ni,vi) + Wi{ii2,V2) for /^i,/^2,i^i,J^2 e ViiW^). We obtain 

;=i 1=1 

□ 

Next coroUary follows directly from Lemma 14.21 and Lemma 14.31 
Corollary 4.4. (i) Let n^, . . . , e 7'i(M) and Ni,...,NdeN. Then 

d 

EN{^i^ ®■■■®^Ji'^)<^£N,{^i^), where N -.^ Ni ■ ■ ■ Nd. 
(a) Let fii,...,fiL€ ViiR'^) and N e N. Then 

1=1 

4.5. Delayed curse of dimension. Although Lemma 14.21 Lemma 14.31 and 
Corollary 14. 41 give some estimates of the Monge-Kantorovich-Rubinstein distance be- 
tween general and atomic measures, the number of atoms needed may still be too 
large to allow the assumption (|4.14p in Theorem 14.11 to be fulfilled. Let us for exam- 
ple consider the case, where jj} = ■ ■ ■ = ^'^ hi Lemma [4.21 and £1 = • • • = =: e. 
Then, of course, A'^i = • • • = Nd =: Af and we observe, that the construction given in 
Lemma 14.21 gives an atomic measure, which approximates // up to the error de using 
N''^ atoms, hence with an exponential dependence on the dimension d. This effect is 
another instance of the well-known phenomenon of the curse of dimension. 

However, in many real-life high-dimensional applications the objects of study 
(in our case the measure fj, S Vc{R'^)) concentrate along low-dimensional subspaces 
(or, more general, along low-dimensional manifolds) [Sj El [El [20l [21]. The number 
of atoms necessary to approximate these measures behaves in a much better way, 
allowing the application of (|4.14p and Theorem 14.11 To clarify this effect, let us 
consider ij = /^"'^ • • • (g) /i'' with supp/x^ C [oj, bj] and define aj = bj — aj. Let us 
assume, that cri > (72 5^ ■ ■ ■ 5^ f d > is a rapidly decreasing sequence. Furthermore, 
let e > 0. Then we define k := fc(e) to be the smallest natural number, such that 

d 

J2 '^k< e/2 

k=k{s) + l 

and put Nk = 1 for k E {k{e) + 1, . . . , d}. The numbers iVi = • • • = ^^(^j = A/" are 
chosen large enough so that 

77 E ^ 
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Then Lemma 1321 together with (|4.20p state that there is an atomic measure with 

N = TV*'^^) atoms, such that 

d 

W,{^Ji,^l'')<Y^^<e/2 + e/2. (4.22) 

fe=i ^ 

Hence, at the cost of assuming that the tensor product measure /i is concentrated 
along a fc(e)-dimensional coordinate subspace, we can always approximate the mea- 
sure jjL with accuracy e by using an atomic measure supported on points whose number 
depends exponentially on = fc(e) <C d. However, if we liked to have e — >■ 0, then 
k{e) ^ d and again we are falling under the curse of dimension. This delayed kicking 
in of the need of a large number of points for obtaining high accuracy in the ap- 
proximation (j4.22p is in fact the so-called delayed curse of dimension^ expressed by 
assumption (|4.14p . a concept introduced first by Curbera in [24], in the context of 
optimal integration with respect to Gaussian measures in high dimension. 

Let us only remark, that the discussion above may be easily extended (with help 
of Lemma [4. Bp to sums of tensor product measures. In that case we obtain as atoms 
the so-called sparse grids, cf. [10]. Using suitable change of variables, one could also 
consider measures concentrated around (smooth) low-dimensional manifolds, but this 
goes beyond the scope of this work, see [33] for a broader discussion. 
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